Research Article

Statistical Application and Challenges in Global Gel-Free Proteomic Analysis by Mass Spectrometry

Pages 297-307 | Published online: 16 Dec 2008

