Review

Algorithms for the de novo sequencing of peptides from tandem mass spectra

Pages 645-657 | Published online: 09 Jan 2014
