Abstract
The evolution of the SARS-CoV-2 coronavirus S-protein is studied using a mass spectrometry-based protein phylogenetic approach known as phylonumerics. Mass maps generated for the surface S-protein across various strains, including new variants, are used to construct phylogenetic trees. The trees are built solely from these numerical datasets through a pairwise comparison of mass values from each protein set. Single point mutations are calculated from peptide mass differences across different sets and are displayed at the branch nodes of the trees in a single tree-building step. The topology of the trees is studied at different protein coverages, and the mutations identified are compared with those derived from the sequence data. It is demonstrated that most non-synonymous mutations can be correctly identified from the mass data alone, thus avoiding the need for the gene or protein sequencing and sequence alignment required by other phylogenetic approaches.
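The two numerical ideas summarised above, pairwise comparison of mass maps and inference of a point mutation from a peptide mass difference, can be illustrated with a minimal sketch. This is not the MassTree algorithm itself. The function names, the 0.005 Da tolerance, the simple unmatched-fraction distance and the example mass values are all assumptions chosen for illustration. Only the monoisotopic residue masses are standard values.

```python
from itertools import permutations

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def candidate_substitutions(delta, tol=0.005):
    """Return sorted (from, to) residue pairs whose mass change matches delta.

    delta is (mutant peptide mass - reference peptide mass) in Da.
    The tolerance is an illustrative assumption, not a published setting.
    """
    return sorted(
        (a, b)
        for a, b in permutations(RESIDUE_MASS, 2)
        if abs((RESIDUE_MASS[b] - RESIDUE_MASS[a]) - delta) <= tol
    )


def mass_map_distance(map_a, map_b, tol=0.005):
    """Fraction of masses in either map that have no match in the other.

    A deliberately simple stand-in for a pairwise mass-map comparison.
    """
    unmatched_a = [m for m in map_a if not any(abs(m - n) <= tol for n in map_b)]
    unmatched_b = [n for n in map_b if not any(abs(m - n) <= tol for m in map_a)]
    return (len(unmatched_a) + len(unmatched_b)) / (len(map_a) + len(map_b))


# Hypothetical peptide mass maps for two strains (values invented for the example).
reference = [1000.5, 1500.7, 2000.9]
variant = [1000.5, 1500.7, 1942.89452]

distance = mass_map_distance(reference, variant)
delta = variant[2] - reference[2]  # mass shift of the one peptide that differs
candidates = candidate_substitutions(delta)
```

In this example the shift of about -58.005 Da is consistent with an Asp to Gly substitution, the mass change expected for a mutation such as D614G, but it is equally consistent with Glu to Ala. A mass difference alone can therefore leave more than one candidate, and isobaric residues such as Leu and Ile cannot be distinguished at all.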
Acknowledgements
Author Downard thanks Elma H. Akand for helpful advice concerning the application of the MassTree algorithm.
Disclosure statement
No potential competing interests are reported by the authors.