ABSTRACT
Proteogenomics has emerged rapidly as a valuable approach for integrating mass spectrometry-derived proteomic data with genomic and transcriptomic data. It is used to harness the full potential of proteomic datasets in the discovery of potential biomarkers, therapeutic targets and novel proteins associated with various biological processes, including disease. Proteogenomic strategies have been successfully used to identify novel genes and to refine the annotation of existing gene models in various genomes. In recent years, this approach has been extended to cancer biology to unravel the complexity of tumor genomes and proteomes. Standard proteomics workflows that search mass spectra against translated cancer genomes and transcriptomes can potentially identify peptides from mutant proteins, splice variants and fusion proteins in the tumor proteome. Such peptides can complement currently available biomarker panels as potential diagnostic and prognostic biomarkers, and may also have therapeutic utility. This review focuses on the role of proteogenomics in understanding cancer biology.
Financial and competing interests disclosure
The author has no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
Key issues
Proteogenomics has advanced rapidly as an important platform for integrating genomic and transcriptomic data with mass spectrometry-derived proteomic data.
Proteogenomic strategies have been routinely used to refine genome assembly and annotation in several species. They are now being extended to cancer biology to gain insights into the molecular mechanisms of oncogenesis.
Cancer-specific gene products that arise from alterations at the genomic or transcriptomic level can be identified using different proteogenomic strategies.
Several tools and databases are currently available for proteogenomic analysis. However, these need to be better integrated with mass spectrometry search algorithms to identify mutant peptides, fusion proteins, and peptides derived from pseudogenes and ncRNAs.
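As a rough illustration of how a translated variant sequence can yield a mutant-specific peptide for a search database, the following minimal sketch digests a reference and a mutant protein in silico and keeps the peptides unique to the mutant. It is not a method described in this review. The function names, the toy sequences, the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages) and the minimum peptide length are all assumptions made for illustration.

```python
def tryptic_digest(protein, min_len=6):
    """In silico trypsin digest (simplified, assumed rule):
    cleave after K or R unless the next residue is P; no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append(protein[start:])
    # Very short peptides are rarely informative, so drop them.
    return [p for p in peptides if len(p) >= min_len]


def variant_peptides(reference, mutant, min_len=6):
    """Return peptides from the mutant digest that are absent from the reference digest."""
    reference_set = set(tryptic_digest(reference, min_len))
    return [p for p in tryptic_digest(mutant, min_len) if p not in reference_set]


# Toy sequences (illustrative only): a single G -> D substitution.
reference = "MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRK"
mutant = "MTEYKLVVVGADGVGKSALTIQLIQNHFVDEYDPTIEDSYRK"

print(variant_peptides(reference, mutant))
```

In a real workflow, variant-specific peptides of this kind would be appended to the reference protein database before the database search, and any matches would still need careful false discovery rate control and manual validation.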