Abstract
In this paper, we propose a nongraphical representation for protein secondary structures. By counting the frequency of occurrence of all possible four-tuples (i.e., four-letter words) of a protein secondary structure sequence, we construct a set of 3 × 3 matrices for the corresponding protein secondary structure sequence. Furthermore, the leading eigenvalues of these matrices are computed and considered as invariants for the protein secondary structure sequences. To illustrate the utility of our approach, we apply it to a set of real data to distinguish protein structural classes. The result indicates that it can be used to complement the classification of protein secondary structures.