Research Articles

iGluK-Deep: computational identification of lysine glutarylation sites using deep neural networks with general pseudo amino acid compositions

Pages 11691-11704 | Received 15 May 2021, Accepted 24 Jul 2021, Published online: 16 Aug 2021

References

  • Arafat, Md., Ahmad, Md., Shovan, S. M., Dehzangi, A., Dipta, S. R., Hasan, Md. A. M., Taherzadeh, G., Shatabda, S., & Sharma, A. (2020). Accurately predicting glutarylation sites using sequential Bi-peptide-based evolutionary features. Genes, 11(9), 1023. https://doi.org/10.3390/genes11091023
  • Awais, M., Hussain, W., Daanial Khan, Y., Rasool, N., Khan, S. A., & Chou, K.-C. (2019). iPhosH-PseAAC: Identify phosphohistidine sites in proteins by blending statistical moments and position relative features according to the Chou’s 5-step rule and general pseudo amino acid composition. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 18(2), 596–610.
  • Baldi, P., Brunak, S., Chauvin, Y., Andersen, C. A., & Nielsen, H. (2000). Assessing the accuracy of prediction algorithms for classification: An overview. Bioinformatics, 16(5), 412–424. https://doi.org/10.1093/bioinformatics/16.5.412
  • Bao, X., Liu, Z., Zhang, W., Gladysz, K., Fung, Y. M. E., Tian, G., Xiong, Y., Wong, J. W. H., Yuen, K. W. Y., & Li, X. D. (2019). Glutarylation of histone H4 lysine 91 regulates chromatin dynamics. Molecular Cell, 76(4), 660–675. https://doi.org/10.1016/j.molcel.2019.08.018
  • Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166. https://doi.org/10.1109/72.279181
  • Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(1), 281–305.
  • Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30(7), 1145–1159. https://doi.org/10.1016/S0031-3203(96)00142-2
  • Cheng, Y.-m., Hu, X.-n., Peng, Z., Pan, T.-t., Wang, F., Chen, H.-y., Chen, W.-q., Zhang, Y., Zeng, X.-h., & Luo, T. (2019). Lysine glutarylation in human sperm is associated with progressive motility. Human Reproduction, 34(7), 1186–1194. https://doi.org/10.1093/humrep/dez068
  • Chicco, D., & Jurman, G. (2020). The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21(1), 6. https://doi.org/10.1186/s12864-019-6413-7
  • Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv Preprint arXiv:1409.1259.
  • Chou, K.-C. (2001). Using subsite coupling to predict signal peptides. Protein Engineering, 14(2), 75–79. https://doi.org/10.1093/protein/14.2.75
  • Chou, K.-C. (2011). Some remarks on protein attribute prediction and pseudo amino acid composition. Journal of Theoretical Biology, 273(1), 236–247. https://doi.org/10.1016/j.jtbi.2010.12.024
  • Cui, X., Yu, Z., Yu, B., Wang, M., Tian, B., & Ma, Q. (2019). UbiSitePred: A novel method for improving the accuracy of ubiquitination sites prediction by using LASSO to select the optimal Chou’s pseudo components. Chemometrics and Intelligent Laboratory Systems, 184, 28–43.
  • DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics, 44(3), 837–845. https://doi.org/10.2307/2531595
  • Dou, L., Li, X., Zhang, L., Xiang, H., & Xu, L. (2021). iGlu_adaboost: Identification of lysine glutarylation using the AdaBoost classifier. Journal of Proteome Research, 20(1), 191–201. https://doi.org/10.1021/acs.jproteome.0c00314
  • Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
  • Furuya, E., & Uyeda, K. (1980). Regulation of phosphofructokinase by a new mechanism. An activation factor binding to phosphorylated enzyme. Journal of Biological Chemistry, 255(24), 11656–11659. https://doi.org/10.1016/S0021-9258(19)70181-1
  • Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36. https://doi.org/10.1148/radiology.143.1.7063747
  • Hirschey, M. D., & Zhao, Y. (2015). Metabolic regulation by lysine malonylation, succinylation, and glutarylation. Molecular & Cellular Proteomics, 14(9), 2308–2315. https://doi.org/10.1074/mcp.R114.046664
  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  • Huang, K.-Y., Kao, H.-J., Hsu, J. B.-K., Weng, S.-L., & Lee, T.-Y. (2019). Characterization and identification of lysine glutarylation based on intrinsic interdependence between positions in the substrate sites. BMC Bioinformatics, 19(Suppl 13), 384. https://doi.org/10.1186/s12859-018-2394-9
  • Hussain, W., Khan, Y. D., Rasool, N., Khan, S. A., & Chou, K.-C. (2019a). SPalmitoylC-PseAAC: A sequence-based model developed via Chou’s 5-steps rule and general PseAAC for identifying S-palmitoylation sites in proteins. Analytical Biochemistry, 568, 14–23. https://doi.org/10.1016/j.ab.2018.12.019
  • Hussain, W., Khan, Y. D., Rasool, N., Khan, S. A., & Chou, K.-C. (2019b). SPrenylC-PseAAC: A sequence-based model developed via Chou’s 5-steps rule and general PseAAC for identifying S-prenylation sites in proteins. Journal of Theoretical Biology, 468, 1–11. https://doi.org/10.1016/j.jtbi.2019.02.007
  • Hussain, W., Qaddir, I., Mahmood, S., & Rasool, N. (2018). In silico targeting of non-structural 4B protein from dengue virus 4 with spiropyrazolopyridone: Study of molecular dynamics simulation, ADMET and virtual screening. Virusdisease, 29(2), 147. https://doi.org/10.1007/s13337-018-0446-4
  • Ju, Z., & He, J.-J. (2018). Prediction of lysine glutarylation sites by maximum relevance minimum redundancy feature selection. Analytical Biochemistry, 550, 1–7. https://doi.org/10.1016/j.ab.2018.04.005
  • Ju, Z., & Wang, S.-Y. (2020). Computational identification of lysine glutarylation sites using positive-unlabeled learning. Current Genomics, 21(3), 204–211. https://doi.org/10.2174/1389202921666200511072327
  • Khan, Y. D., Jamil, M., Hussain, W., Rasool, N., Khan, S. A., & Chou, K.-C. (2019). pSSbond-PseAAC: Prediction of disulfide bonding sites by integration of PseAAC and statistical moments. Journal of Theoretical Biology, 463, 47–55. https://doi.org/10.1016/j.jtbi.2018.12.015
  • Khan, Y. D., Rasool, N., Hussain, W., Khan, S. A., & Chou, K.-C. (2018). iPhosY-PseAAC: Identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC. Molecular Biology Reports, 45(1), 2501–2509.
  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791
  • Li, F., Wang, Y., Li, C., Marquez-Lago, T. T., Leier, A., Rawlings, N. D., Haffari, G., Revote, J., Akutsu, T., & Chou, K.-C. (2018). Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: A comprehensive revisit and benchmarking of existing methods. Briefings in Bioinformatics, 20(6), 2150–2166.
  • Li, S.-H., Zhang, J., Zhao, Y.-W., Dao, F.-Y., Ding, H., Chen, W., & Tang, H. (2019). iPhoPred: A predictor for identifying phosphorylation sites in human protein. IEEE Access, 7, 177517–177528. https://doi.org/10.1109/ACCESS.2019.2953951
  • Lv, H., Dao, F.-Y., Guan, Z.-X., Yang, H., Li, Y.-W., & Lin, H. (2020). Deep-Kcr: Accurate detection of lysine crotonylation sites using deep learning method. Briefings in Bioinformatics, 22(4), 243–255.
  • Marzban, C. (2004). The ROC curve and the area under it as performance measures. Weather and Forecasting, 19(6), 1106–1114. https://doi.org/10.1175/825.1
  • Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta, 405(2), 442–451. https://doi.org/10.1016/0005-2795(75)90109-9
  • Naseer, S., & Saleem, Y. (2018). Enhanced network intrusion detection using deep convolutional neural networks. KSII Transactions on Internet and Information Systems, 12(10), 5159–5178.
  • Naseer, S., Ali, R. F., Fati, S. M., & Muneer, A. (2021). iNitroY-Deep: Computational identification of nitrotyrosine sites to supplement carcinogenesis studies using deep learning. IEEE Access, 9, 73624–73640. https://doi.org/10.1109/ACCESS.2021.3080041
  • Naseer, S., Ali, R. F., Muneer, A., & Fati, S. M. (2021). iAmideV-Deep: Valine amidation site prediction in proteins using deep learning and pseudo amino acid compositions. Symmetry, 13(4), 560. https://doi.org/10.3390/sym13040560
  • Naseer, S., Ali, R. F., Dominic, P. D. D., & Saleem, Y. (2020). Learning representations of network traffic using deep neural networks for network anomaly detection: A perspective towards oil and gas IT infrastructures. Symmetry, 12(11), 1882. https://doi.org/10.3390/sym12111882
  • Naseer, S., Hussain, W., Daanial Khan, Y., & Rasool, N. (2020a). iPhosS(Deep)-PseAAC: Identify phosphoserine sites in proteins using deep learning on general pseudo amino acid compositions via modified 5-steps rule. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 14(8), 1–12.
  • Naseer, S., Hussain, W., Daanial Khan, Y., & Rasool, N. (2020b). NPalmitoylDeep-PseAAC: A predictor for N-palmitoylation sites in proteins using deep representations of proteins and PseAAC via modified 5-steps rule. Current Bioinformatics, 16(2), 294–305. https://doi.org/10.2174/1574893615666200129110450
  • Naseer, S., Hussain, W., Daanial Khan, Y., & Rasool, N. (2021a). Sequence-based identification of Arginine amidation sites in proteins using deep representations of proteins and PseAAC. Current Bioinformatics, 15(8), 937–948. https://doi.org/10.2174/1574893615666200129110450
  • Naseer, S., Hussain, W., Daanial Khan, Y., & Rasool, N. (2021b). Optimization of serine phosphorylation prediction in proteins by comparing human engineered features and deep representations. Analytical Biochemistry, 615, 114069. https://doi.org/10.1016/j.ab.2020.114069
  • Qiu, W.-R., Sun, B.-Q., Tang, H., Huang, J., & Lin, H. (2017). Identify and analysis crotonylation sites in histone by using support vector machines. Artificial Intelligence in Medicine, 83, 75–81. https://doi.org/10.1016/j.artmed.2017.02.007
  • Rasool, N., Iftikhar, S., Amir, A., & Hussain, W. (2018). Structural and quantum mechanical computations to elucidate the altered binding mechanism of metal and drug with pyrazinamidase from Mycobacterium tuberculosis due to mutagenicity. Journal of Molecular Graphics & Modelling, 80, 126–131. https://doi.org/10.1016/j.jmgm.2017.12.011
  • Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533–536. https://doi.org/10.1038/323533a0
  • Saito, T., & Rehmsmeier, M. (2015). The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One, 10(3), e0118432. https://doi.org/10.1371/journal.pone.0118432
  • Song, J., Wang, Y., Li, F., Akutsu, T., Rawlings, N. D., Webb, G. I., & Chou, K.-C. (2018). iProt-Sub: A comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Briefings in Bioinformatics, 20(2), 638–658.
  • Sun, X., & Xu, W. (2014). Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Processing Letters, 21(11), 1389–1393. https://doi.org/10.1109/LSP.2014.2337313
  • Tan, M., Peng, C., Anderson, K. A., Chhoy, P., Xie, Z., Dai, L., Park, J., Chen, Y., Huang, H., Zhang, Y., Ro, J., Wagner, G. R., Green, M. F., Madsen, A. S., Schmiesing, J., Peterson, B. S., Xu, G., Ilkayeva, O. R., Muehlbauer, M. J. … Zhao, Y. (2014). Lysine glutarylation is a protein posttranslational modification regulated by SIRT5. Cell Metabolism, 19(4), 605–617. https://doi.org/10.1016/j.cmet.2014.03.014
  • The UniProt Consortium. (2019). UniProt: A worldwide hub of protein knowledge. Nucleic Acids Research, 47(D1), D506–D515.
  • Vacic, V., Iakoucheva, L. M., & Radivojac, P. (2006). Two sample logo: A graphical representation of the differences between two sets of sequence alignments. Bioinformatics, 22(12), 1536–1537. https://doi.org/10.1093/bioinformatics/btl151
  • van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579–2605.
  • Wang, L., Zhang, R., & Mu, Y. (2019). Fu-SulfPred: Identification of protein S-sulfenylation sites by fusing forests via Chou’s general PseAAC. Journal of Theoretical Biology, 461, 51–58. https://doi.org/10.1016/j.jtbi.2018.10.046
  • Wanhua, S., Yuan, Y., & Zhu, M. (2015). A relationship between the average precision and the area under the ROC curve. In Proceedings of the 2015 International Conference on the Theory of Information Retrieval (pp. 349–352).
  • Xu, Y., Yang, Y., Ding, J., & Li, C. (2018). iGlu-Lys: A predictor for lysine glutarylation through amino acid pair order features. IEEE Transactions on Nanobioscience, 17(4), 394–401. https://doi.org/10.1109/TNB.2018.2848673
  • Zhang, D., Xu, Z.-C., Su, W., Yang, Y.-H., Lv, H., Yang, H., & Lin, H. (2020). iCarPS: A computational tool for identifying protein carbonylation sites by novel encoded features. Bioinformatics, 37(2), 171–177.
  • Zhao, Y.-W., Lai, H.-Y., Tang, H., Chen, W., & Lin, H. (2016). Prediction of phosphothreonine sites in human proteins by fusing different features. Scientific Reports, 6, 34817. https://doi.org/10.1038/srep34817
  • Zhou, B., Du, Y., Xue, Y., Miao, G., Wei, T., & Zhang, P. (2020). Identification of malonylation, succinylation, and glutarylation in serum proteins of acute myocardial infarction patients. Proteomics – Clinical Applications, 14(1), 1900103. https://doi.org/10.1002/prca.201900103
