Research Article

MehNet: a vigesimal-based model by amino acid melting points generates unique ID numbers for protein sequences

Received 24 Oct 2023, Accepted 02 Jan 2024, Published online: 17 Jan 2024

References

  • Alakus, T. B., & Turkoglu, I. (2022a). A comparative study of amino acid encoding methods for predicting drug-target interactions in COVID-19 disease. In A. T. Azar, & A. E. Hassanien (Eds.), Studies in systems, decision and control (Vol. 366, pp. 619–643). Springer International Publishing. https://doi.org/10.1007/978-3-030-72834-2_18
  • Alakus, T. B., & Turkoglu, I. (2022b). Prediction of viral-host interactions of COVID-19 by computational methods. Chemometrics and Intelligent Laboratory Systems: An International Journal Sponsored by the Chemometrics Society, 228(January), 104622. https://doi.org/10.1016/j.chemolab.2022.104622
  • Alley, E. C., Khimulya, G., Biswas, S., AlQuraishi, M., & Church, G. M. (2019). Unified rational protein engineering with sequence-based deep representation learning. Nature Methods, 16(12), 1315–1322. https://doi.org/10.1038/s41592-019-0598-1
  • AlQuraishi, M. (2019). AlphaFold at CASP13. Bioinformatics (Oxford, England), 35(22), 4862–4865. https://doi.org/10.1093/bioinformatics/btz422
  • Asgari, E., & Mofrad, M. R. K. (2015). Continuous distributed representation of biological sequences for deep proteomics and genomics. PloS One, 10(11), e0141287. https://doi.org/10.1371/journal.pone.0141287
  • Atchley, W. R., Zhao, J., Fernandes, A. D., & Drüke, T. (2005). Solving the protein sequence metric problem. Proceedings of the National Academy of Sciences of the United States of America, 102(18), 6395–6400. https://doi.org/10.1073/pnas.0408677102
  • Bordin, N., Dallago, C., Heinzinger, M., Kim, S., Littmann, M., Rauer, C., Steinegger, M., Rost, B., & Orengo, C. (2023). Novel machine learning approaches revolutionize protein knowledge. Trends in Biochemical Sciences, 48(4), 345–359. https://doi.org/10.1016/j.tibs.2022.11.001
  • Castro-Chavez, F. (2016). Anatomical mnemonics of the genetic code: A functional icosahedron and the vigesimal system of the maya to represent the twenty proteinogenic amino acids. Journal of Biology and Nature, 5(3), 140–147.
  • Cock, P. J. A., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., & de Hoon, M. J. L. (2009). Biopython: Freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics (Oxford, England), 25(11), 1422–1423. https://doi.org/10.1093/bioinformatics/btp163
  • Dearden, J. C. (1991). The QSAR prediction of melting point, a property of environmental relevance. The Science of the Total Environment, 109-110(C), 59–68. https://doi.org/10.1016/0048-9697(91)90170-j
  • Erten, M. (2024). MehNet Source Code and IDs, Mendeley Data, V1. https://doi.org/10.17632/24x9hdckx5.1
  • Erten, M., Acharya, M. R., Kamath, A. P., Sampathila, N., Bairy, G. M., Aydemir, E., Barua, P. D., Baygin, M., Tuncer, I., Dogan, S., & Tuncer, T. (2022). Hamlet-pattern-based automated COVID-19 and influenza detection model using protein sequences. Diagnostics (Basel, Switzerland), 12(12), 3181. https://doi.org/10.3390/diagnostics12123181
  • Erten, M., Aydemir, E., Barua, P. D., Baygin, M., Dogan, S., Tuncer, T., Tan, R.-S., Hafeez-Baig, A., & Rajendra Acharya, U. (2024). Novel tiny textural motif pattern-based RNA virus protein sequence classification model. Expert Systems with Applications, 242, 122781. https://doi.org/10.1016/j.eswa.2023.122781
  • Esmaili, F., Pourmirzaei, M., Ramazi, S., Shojaeilangari, S., & Yavari, E. (2023). A review of machine learning and algorithmic methods for protein phosphorylation sites prediction. Genomics, Proteomics & Bioinformatics. https://doi.org/10.1016/j.gpb.2023.03.007
  • Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., Del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. https://doi.org/10.1038/s41586-020-2649-2
  • Jhamb, S., Liang, X., Gani, R., & Hukkerikar, A. S. (2018). Estimation of physical properties of amino acids by group-contribution method. Chemical Engineering Science, 175, 148–161. https://doi.org/10.1016/j.ces.2017.09.019
  • Jing, X., Dong, Q., Hong, D., & Lu, R. (2020). Amino acid encoding methods for protein sequences: A comprehensive review and assessment. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 17(6), 1918–1931. https://doi.org/10.1109/TCBB.2019.2911677
  • Katritzky, A. R., Jain, R., Lomaka, A., Petrukhin, R., Maran, U., & Karelson, M. (2001). Perspective on the relationship between melting points and chemical structure. Crystal Growth & Design, 1(4), 261–265. https://doi.org/10.1021/cg010009s
  • Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., & Kanehisa, M. (2008). AAindex: Amino acid index database, progress report 2008. Nucleic Acids Research, 36(Database issue), D202–5. https://doi.org/10.1093/nar/gkm998
  • McKinney, W. (2010). Data structures for statistical computing in Python. [Paper presentation]. In Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a
  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. In 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings (pp. 1–12).
  • Murray, R. K., Granner, D. K., Mayes, P. A., & Rodwell, V. W. (2003). Harper’s illustrated biochemistry (26th ed.). McGraw-Hill.
  • Pearson, W. R. (2013). Selecting the right similarity-scoring matrix. Current Protocols in Bioinformatics, 43(1), 3.5.1–3.5.9. https://doi.org/10.1002/0471250953.bi0305s43
  • Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J., Guo, D., Ott, M., Zitnick, C. L., Ma, J., & Fergus, R. (2021). Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences of the United States of America, 118(15), 1–46. https://doi.org/10.1073/pnas.2016239118
  • Seife, C. (2000). Zero: The biography of a dangerous idea. Penguin Group.
  • UniProt Consortium. (2021). UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Research, 49(D1), D480–D489. https://doi.org/10.1093/nar/gkaa1100
  • Wishart, D. S., Guo, A., Oler, E., Wang, F., Anjum, A., Peters, H., Dizon, R., Sayeeda, Z., Tian, S., Lee, B. L., Berjanskii, M., Mah, R., Yamamoto, M., Jovel, J., Torres-Calzada, C., Hiebert-Giesbrecht, M., Lui, V. W., Varshavi, D., Varshavi, D., … Gautam, V. (2022). HMDB 5.0: The Human Metabolome Database for 2022. Nucleic Acids Research, 50(D1), D622–D631. https://doi.org/10.1093/NAR/GKAB1062
  • Xu, Y., Ding, Y.-X., Ding, J., Wu, L.-Y., & Xue, Y. (2016). Mal-Lys: Prediction of lysine malonylation sites in proteins integrated sequence-based features with mRMR feature selection. Scientific Reports, 6(1), 38318. https://doi.org/10.1038/srep38318
  • Zhang, Y., Xie, R., Wang, J., Leier, A., Marquez-Lago, T. T., Akutsu, T., Webb, G. I., Chou, K.-C., & Song, J. (2019). Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework. Briefings in Bioinformatics, 20(6), 2185–2199. https://doi.org/10.1093/bib/bby079
  • Zhao, J., Jiang, H., Zou, G., Lin, Q., Wang, Q., Liu, J., & Ma, L. (2022). CNNArginineMe: A CNN structure for training models for predicting arginine methylation sites based on the One-Hot encoding of peptide sequence. Frontiers in Genetics, 13(October), 1036862. https://doi.org/10.3389/fgene.2022.1036862
