Abstract
This review describes how closely proteogenomics and systems biology are intertwined. Quantitative cell-wide monitoring of cellular processes and the analysis of the resulting information form the basis of systems biology. Establishing the most comprehensive protein-parts list possible is an essential prerequisite for analyzing the cell-wide dynamics of proteins, their post-translational modifications and their complex network interactions, and for interpreting these data as a whole. High-quality genome annotation is thus a crucial basis. Proteogenomics consists of the high-throughput identification and characterization of proteins by large-scale shotgun MS/MS approaches and the integration of these data with genomic data. Discovering the remaining unannotated genes, defining translational start sites, and listing signal peptide processing events and post-translational modifications are tasks that can currently be carried out at a full-genome scale as soon as the genomic sequence is available. Proteomics is increasingly being used at the primary stage of genome annotation, and such an approach may become standard for genome projects in the near future. Advantageously, the same experimental proteomic datasets may be used to characterize the specific metabolic traits of the organism under study. Undoubtedly, comparative genomics will experience a renaissance when this new dimension is taken into account. Synthetic biology, which aims to re-engineer living systems, will also benefit from this significant progress.
Financial & competing interests disclosure
This study was supported by the Commissariat à l’Energie Atomique and an ANR grant (JCJC06-152439). The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
Writing assistance was utilized in the production of this manuscript and was supported by the Commissariat à l’Energie Atomique.