Abstract
Computational characterization of multiple histidine (His) post-translational modifications (PTMs) at enzyme active sites complements tedious experimental characterization in proteins of unknown function (PUFs) and domains of unknown function (DUFs). Only a handful of histidine-PTM prediction tools exist, and each annotates only a single function. Here, we addressed this problem using artificial neural networks on a functional histidine dataset curated from enzyme (protein) sequences available in the UniProt database (sample size n = 1584). The convolutional neural network (CNN) model (‘Hist-i-fy’) performed best, with 75% overall accuracy and F1-score. A case study was performed on histidine phosphorylation (n = 34) using sites obtained from mass spectrometry data. For the first time, we report a multiple-His-PTM prediction tool (https://histify.streamlit.app/ and https://github.com/dibyansu24-maker/Histify), with optimal performance. The inputs to the tool are (i) a protein sequence containing histidine, and (ii) the histidine residue number. The prediction output is one of eight histidine functions: acetylation, ribosylation, glycosylation, hydroxylation, methylation, oxidation, phosphorylation, and protein splicing.
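To make the tool's input format concrete, the following minimal Python sketch shows one plausible way a (sequence, residue number) pair could be turned into a fixed-length, one-hot-encoded window suitable for a CNN. It is an illustration under stated assumptions, not the Hist-i-fy implementation: the function names `extract_window` and `one_hot`, the padding symbol `X`, and the window half-width of 7 are all hypothetical and do not come from the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for positions past the sequence ends

# The eight output classes named in the abstract.
HIS_FUNCTIONS = [
    "acetylation", "ribosylation", "glycosylation", "hydroxylation",
    "methylation", "oxidation", "phosphorylation", "protein splicing",
]


def extract_window(sequence, residue_number, half_width=7):
    """Return the residues flanking a histidine, centered on it.

    residue_number is 1-based, as in standard protein numbering.
    half_width is an assumed value, not taken from the paper.
    """
    seq = sequence.upper()
    if not 1 <= residue_number <= len(seq):
        raise ValueError("residue number is outside the sequence")
    idx = residue_number - 1
    if seq[idx] != "H":
        raise ValueError(f"residue {residue_number} is {seq[idx]}, not histidine")
    # Pad both ends so histidines near a terminus still give a full window.
    padded = PAD * half_width + seq + PAD * half_width
    return padded[idx : idx + 2 * half_width + 1]


def one_hot(window):
    """Encode each residue as a 20-element 0/1 vector (padding is all zeros)."""
    return [[1 if aa == ref else 0 for ref in AMINO_ACIDS] for aa in window]
```

For example, `extract_window("MKHAC", 3, half_width=3)` returns `"XMKHACX"`, with the queried histidine at the center. A classifier would then map the encoded window to one of the eight entries in `HIS_FUNCTIONS`.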
Communicated by Ramaswamy H. Sarma
Disclosure statement
No potential conflict of interest was reported by the author(s).