DeEPn: a deep neural network based tool for enzyme functional annotation

Pages 2733-2743 | Received 19 Feb 2020, Accepted 02 Apr 2020, Published online: 22 Apr 2020

