
Protein sequence comparison based on representation on a finite dimensional unit hypercube

Pages 6425-6439 | Received 16 May 2023, Accepted 01 Jul 2023, Published online: 14 Oct 2023

