Abstract
In the first part of this paper, we present a novel graphical representation of proteins. It starts with the construction of a protein map obtained from a matrix whose elements record the adjacencies of pairs of amino acids in the primary structure of the protein. The matrix elements of this map are then interpreted as vertices of a graph, labelled sequentially in the order of the protein sequence. Each vertex is connected to its nearest neighbour that has a smaller label. In the second part of this paper, we describe the construction of protein binary codes that can serve as protein descriptors. This novel graphical representation of proteins is illustrated on segments of trans-membrane proteins, which are embedded in the membrane.
Acknowledgment
We wish to thank Professor A. T. Balaban (Texas A&M University at Galveston) for comments that improved the presentation of the results and MR wishes to thank the Laboratory of Chemometrics at the National Institute of Chemistry for cordial hospitality. This work has been supported by the Ministry of Higher Education, Science and Technology of the Republic of Slovenia under research grant P1-0017, and by the Ministry of Science, Education and Sport of the Republic of Croatia under the Project 098-0982929-2917.
Notes
1. We coined the term graphical bioinformatics to emphasize the distinction between two parts of bioinformatics: the part concerned with comparative studies of bio-sequences based on direct computer-driven comparisons of primary DNA and protein sequences, and the part dealing with graphical representations of DNA and proteins and their numerical characterization based on mathematical invariants extracted from those representations. Observe an important distinction between the two branches: the former always considers at least two sequences simultaneously, while in graphical bioinformatics one can focus on and characterize a single DNA, RNA, protein or proteome.
2. We may add that recently M. Randić recognized that provides the basic tool for very efficient search for protein alignment. Namely, by superimposing such tables for two proteins one can immediately extract which amino acid is aligned in any pair of proteins. See: M. Randić, Very Efficient Search for Protein Alignment – VESPA, J. Comput. Chem. 33 (2012), pp. 702–707.