Research Articles

Unique and exclusive peptide signatures directly identify intrinsically disordered proteins from sequences without structural information

Pages 2885-2893 | Received 19 Mar 2020, Accepted 13 Apr 2020, Published online: 27 Apr 2020
 

Abstract

Intrinsically disordered proteins are now widely accepted to play crucial roles in biological function. Identifying sequence signatures of intrinsic disorder is a key step towards cataloguing their occurrence in proteomes. In this work, we computationally enumerated a library of all 3,368,400 possible dipeptides, tripeptides, tetrapeptides and pentapeptides composed of the 20 natural amino acids. This allowed us to identify 36 unique tetrapeptides that are present exclusively in intrinsically disordered proteins and absent from the complete primary sequence space of naturally occurring structured proteins. Further, of more than 530,000 known naturally occurring primary sequences without any structural information, 1,349 contain these unique signatures of intrinsic disorder. These sequences, whose cellular functions range from housekeeping to metabolism to transport, more than double the number of currently known intrinsically disordered proteins. Similarly, we report 26,577 pentapeptide signatures that are exclusive to intrinsically disordered proteins and absent from naturally occurring structured proteins; these identify ∼50% of more than half a million curated protein sequences without structural information as intrinsically disordered. The results reported are a major leap forward in exploring functional manifestations of intrinsically disordered proteins.
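The library size and the notion of an "exclusive" signature described in the abstract can be illustrated with a short sketch. It is not the authors' pipeline: the function names and the toy sequences are hypothetical, chosen only to show the set logic (peptides seen in disordered sequences minus peptides seen in structured ones), and a real analysis would run over full curated sequence databases.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids


def library_size(lengths=(2, 3, 4, 5)):
    """Number of possible peptides of the given lengths: 20^2 + 20^3 + 20^4 + 20^5."""
    return sum(len(AMINO_ACIDS) ** k for k in lengths)


def kmers(sequence, k):
    """Set of all overlapping length-k peptides in one sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def exclusive_signatures(disordered, structured, k):
    """Peptides found in at least one disordered sequence and in no structured sequence."""
    seen_disordered = set().union(*(kmers(s, k) for s in disordered))
    seen_structured = set().union(*(kmers(s, k) for s in structured))
    return seen_disordered - seen_structured


def flag_sequences(unannotated, signatures, k):
    """Sequences without structural information that contain at least one signature."""
    return [s for s in unannotated if kmers(s, k) & signatures]


# Toy data (hypothetical, for illustration only; not real proteins).
toy_disordered = ["MKSPEEK", "GSPEEKQ"]
toy_structured = ["MKSLVIA", "SPEAKLV"]

signatures = exclusive_signatures(toy_disordered, toy_structured, k=4)
flagged = flag_sequences(["AAPEEKAA", "LLLLLLLL"], signatures, k=4)
```

The call `library_size()` reproduces the 3,368,400 figure quoted in the abstract. Representing peptides as Python sets keeps the exclusivity test a single set difference, at the cost of holding all observed peptides in memory.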

Communicated by Ramaswamy H. Sarma

Acknowledgements

AMC is grateful to IIT Delhi for fellowship support. The authors also thank IIT Delhi for providing access to the HPC facility. AM is grateful to Kusuma Trust (UK) for their generous funding support towards assisting him in establishing the teaching and research programs of the School of Biological Sciences (subsequently renamed as the Kusuma School of Biological Sciences) at IIT Delhi. AM is also grateful to Dept. of Biotechnology, Government of India and the National Supercomputing Mission, Government of India for their support to the Supercomputing Facility for Bioinformatics & Computational Biology at IIT Delhi.

Author contributions

AMC and ST collected the data. AMC collected the complete peptide count data and ST independently confirmed the dipeptide and tripeptide count data. AMC also analyzed some of the data. AM designed the study, analyzed the data, prepared the figures and wrote the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).
