References
- Alakus, T. B., & Turkoglu, I. (2022a). A comparative study of amino acid encoding methods for predicting drug-target interactions in COVID-19 disease. In A. T. Azar, & A. E. Hassanien (Eds.), Studies in systems, decision and control (Vol. 366, pp. 619–643). Springer International Publishing. https://doi.org/10.1007/978-3-030-72834-2_18
- Alakus, T. B., & Turkoglu, I. (2022b). Prediction of viral-host interactions of COVID-19 by computational methods. Chemometrics and Intelligent Laboratory Systems, 228, 104622. https://doi.org/10.1016/j.chemolab.2022.104622
- Alley, E. C., Khimulya, G., Biswas, S., AlQuraishi, M., & Church, G. M. (2019). Unified rational protein engineering with sequence-based deep representation learning. Nature Methods, 16(12), 1315–1322. https://doi.org/10.1038/s41592-019-0598-1
- AlQuraishi, M. (2019). AlphaFold at CASP13. Bioinformatics, 35(22), 4862–4865. https://doi.org/10.1093/bioinformatics/btz422
- Asgari, E., & Mofrad, M. R. K. (2015). Continuous distributed representation of biological sequences for deep proteomics and genomics. PloS One, 10(11), e0141287. https://doi.org/10.1371/journal.pone.0141287
- Atchley, W. R., Zhao, J., Fernandes, A. D., & Drüke, T. (2005). Solving the protein sequence metric problem. Proceedings of the National Academy of Sciences of the United States of America, 102(18), 6395–6400. https://doi.org/10.1073/pnas.0408677102
- Bordin, N., Dallago, C., Heinzinger, M., Kim, S., Littmann, M., Rauer, C., Steinegger, M., Rost, B., & Orengo, C. (2023). Novel machine learning approaches revolutionize protein knowledge. Trends in Biochemical Sciences, 48(4), 345–359. https://doi.org/10.1016/j.tibs.2022.11.001
- Castro-Chavez, F. (2016). Anatomical mnemonics of the genetic code: A functional icosahedron and the vigesimal system of the Maya to represent the twenty proteinogenic amino acids. Journal of Biology and Nature, 5(3), 140–147.
- Cock, P. J. A., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., & de Hoon, M. J. L. (2009). Biopython: Freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11), 1422–1423. https://doi.org/10.1093/bioinformatics/btp163
- Dearden, J. C. (1991). The QSAR prediction of melting point, a property of environmental relevance. The Science of the Total Environment, 109-110(C), 59–68. https://doi.org/10.1016/0048-9697(91)90170-j
- Erten, M. (2024). MehNet source code and IDs. Mendeley Data, V1. https://doi.org/10.17632/24x9hdckx5.1
- Erten, M., Acharya, M. R., Kamath, A. P., Sampathila, N., Bairy, G. M., Aydemir, E., Barua, P. D., Baygin, M., Tuncer, I., Dogan, S., & Tuncer, T. (2022). Hamlet-pattern-based automated COVID-19 and influenza detection model using protein sequences. Diagnostics, 12(12), 3181. https://doi.org/10.3390/diagnostics12123181
- Erten, M., Aydemir, E., Barua, P. D., Baygin, M., Dogan, S., Tuncer, T., Tan, R.-S., Hafeez-Baig, A., & Rajendra Acharya, U. (2024). Novel tiny textural motif pattern-based RNA virus protein sequence classification model. Expert Systems with Applications, 242, 122781. https://doi.org/10.1016/j.eswa.2023.122781
- Esmaili, F., Pourmirzaei, M., Ramazi, S., Shojaeilangari, S., & Yavari, E. (2023). A review of machine learning and algorithmic methods for protein phosphorylation sites prediction. Genomics, Proteomics & Bioinformatics. https://doi.org/10.1016/j.gpb.2023.03.007
- Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., Del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. https://doi.org/10.1038/s41586-020-2649-2
- Jhamb, S., Liang, X., Gani, R., & Hukkerikar, A. S. (2018). Estimation of physical properties of amino acids by group-contribution method. Chemical Engineering Science, 175, 148–161. https://doi.org/10.1016/j.ces.2017.09.019
- Jing, X., Dong, Q., Hong, D., & Lu, R. (2020). Amino acid encoding methods for protein sequences: A comprehensive review and assessment. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 17(6), 1918–1931. https://doi.org/10.1109/TCBB.2019.2911677
- Katritzky, A. R., Jain, R., Lomaka, A., Petrukhin, R., Maran, U., & Karelson, M. (2001). Perspective on the relationship between melting points and chemical structure. Crystal Growth & Design, 1(4), 261–265. https://doi.org/10.1021/cg010009s
- Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., & Kanehisa, M. (2008). AAindex: Amino acid index database, progress report 2008. Nucleic Acids Research, 36(Database issue), D202–D205. https://doi.org/10.1093/nar/gkm998
- McKinney, W. (2010). Data structures for statistical computing in Python [Paper presentation]. In Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings (pp. 1–12).
- Murray, R. K., Granner, D. K., Mayes, P. A., & Rodwell, V. W. (2003). Harper’s illustrated biochemistry (26th ed.). McGraw-Hill.
- Pearson, W. R. (2013). Selecting the right similarity-scoring matrix. Current Protocols in Bioinformatics, 43(1), 3.5.1–3.5.9. https://doi.org/10.1002/0471250953.bi0305s43
- Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J., Guo, D., Ott, M., Zitnick, C. L., Ma, J., & Fergus, R. (2021). Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences of the United States of America, 118(15), 1–46. https://doi.org/10.1073/pnas.2016239118
- Seife, C. (2000). Zero: The biography of a dangerous idea. Penguin Group.
- UniProt Consortium. (2021). UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Research, 49(D1), D480–D489. https://doi.org/10.1093/nar/gkaa1100
- Wishart, D. S., Guo, A., Oler, E., Wang, F., Anjum, A., Peters, H., Dizon, R., Sayeeda, Z., Tian, S., Lee, B. L., Berjanskii, M., Mah, R., Yamamoto, M., Jovel, J., Torres-Calzada, C., Hiebert-Giesbrecht, M., Lui, V. W., Varshavi, D., Varshavi, D., … Gautam, V. (2022). HMDB 5.0: The Human Metabolome Database for 2022. Nucleic Acids Research, 50(D1), D622–D631. https://doi.org/10.1093/nar/gkab1062
- Xu, Y., Ding, Y.-X., Ding, J., Wu, L.-Y., & Xue, Y. (2016). Mal-Lys: Prediction of lysine malonylation sites in proteins integrated sequence-based features with mRMR feature selection. Scientific Reports, 6(1), 38318. https://doi.org/10.1038/srep38318
- Zhang, Y., Xie, R., Wang, J., Leier, A., Marquez-Lago, T. T., Akutsu, T., Webb, G. I., Chou, K.-C., & Song, J. (2019). Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework. Briefings in Bioinformatics, 20(6), 2185–2199. https://doi.org/10.1093/bib/bby079
- Zhao, J., Jiang, H., Zou, G., Lin, Q., Wang, Q., Liu, J., & Ma, L. (2022). CNNArginineMe: A CNN structure for training models for predicting arginine methylation sites based on the One-Hot encoding of peptide sequence. Frontiers in Genetics, 13, 1036862. https://doi.org/10.3389/fgene.2022.1036862