Editorial

Where are all the biomarkers?

Pages 681-683 | Published online: 09 Jan 2014

Since the early 2000s, the great hope for proteomics research has focused on the discovery of novel biomarkers, primarily for cancers. One does not have to be an oncologist to understand the impact that effective biomarkers would have on the survival rates of patients diagnosed with cancer. Biomarkers provide benefits whether applied to early diagnosis, prognosis or therapeutic monitoring. What is necessary, however, is that the biomarker can be shown to unambiguously provide a benefit in one of these areas.

Scientists have labored fervently over the past decade to identify cancer biomarkers using proteomic technologies. Foremost among these technologies has been mass spectrometry (MS). Which seminal finding led investigators to envisage that MS could be used to discover biomarkers is debatable; however, one that must be considered is the identification of almost 1500 yeast proteins using multidimensional protein identification technology (MudPIT) by the laboratory of John Yates III [1]. This familiar technology separates enzymatically digested proteins using strong cation exchange followed by reversed-phase liquid chromatography. The separated peptides are eluted directly into the mass spectrometer and identified using MS/MS. With increasing chromatographic resolution and the development of mass spectrometers possessing faster duty cycles and higher sensitivity, resolution and mass accuracy, investigators have claimed the identification of over 10,000 peptides in a single experiment [2]. However, this number pales in comparison to the >100,000 peptide features that have been reported in a single liquid chromatography-MS/MS run [3].

It is worth taking a step back and pondering how giant a leap these results were in the field of protein chemistry. In 1989, Edman degradation was the primary method of protein sequencing. This method required a cycle time of approximately 1 h per amino acid, and the protein of interest usually had to be isolated and transferred to a membrane prior to analysis [4]. Now, 22 years later, thousands of peptides are routinely identified in complex mixtures using MS and on-line separations. Quantitative approaches were introduced to compare the relative abundances of the identified peptides. These approaches have included chemical (e.g., isotope-coded affinity tags and isobaric tags for relative and absolute quantitation), enzymatic (16O/18O) and metabolic (e.g., 15N and SILAC) isotope labeling, as well as non-labeling approaches (e.g., exponentially modified protein abundance index and spectral count). The ability to compare the relative quantities of thousands of proteins in one set of samples (e.g., healthy controls) with another (e.g., cancer-affected individuals) is the basic foundation of proteomic biomarker discovery.
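
To make the non-labeling measures mentioned above concrete, the following is a minimal Python sketch rather than a validated implementation. It assumes the commonly cited definition of the exponentially modified protein abundance index, emPAI = 10^(observed peptides / observable peptides) − 1, and a simple spectral-count ratio normalized to the total number of spectra in each run. The function names, the pseudocount and the example numbers are illustrative assumptions, not values from any study.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified protein abundance index for one protein.

    Assumes emPAI = 10 ** (observed / observable) - 1, where the inputs are
    counts of observed and theoretically observable peptides.
    """
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def spectral_count_ratio(count_a: int, count_b: int,
                         total_a: int, total_b: int,
                         pseudocount: float = 0.5) -> float:
    """Ratio of normalized spectral counts for one protein in runs A and B.

    Counts are divided by the total spectra in each run; the pseudocount
    (an assumption of this sketch) avoids division by zero when the protein
    is absent from one run.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("total spectra must be positive")
    norm_a = (count_a + pseudocount) / total_a
    norm_b = (count_b + pseudocount) / total_b
    return norm_a / norm_b


if __name__ == "__main__":
    # Hypothetical protein: 5 of 10 observable peptides detected.
    print(empai(5, 10))
    # Hypothetical counts: 10 spectra in cases, 5 in controls, 1000 spectra per run.
    print(spectral_count_ratio(10, 5, 1000, 1000))
```

Both measures are relative: they rank proteins or compare cohorts, but neither gives an absolute amount without calibration.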

Considering how far the technology has progressed and how widely it has been applied in numerous studies over the past decade, the question has to be asked: ‘Where are all the biomarkers?’ The impact of MS is unquestionable, as MS data permeate current scientific literature and the technique has progressively replaced Western blotting as definitive evidence of a protein’s identification. What is lacking are the clinically useful biomarkers that will impact public health. Considerable effort has been spent on assessing sample quality, developing robust sample preparation methods, increasing the performance of mass spectrometers and developing ways of analyzing the data. Yet still the promised disease biomarkers have not been delivered. Much of the proteomic community appears to have discarded the discovery phase and is starting to invest its efforts in developing hypothesis-based methods, such as multiple reaction monitoring MS, to deliver the promised biomarkers.

What needs to be changed to start the engine of biomarker discovery that appeared so promising only a decade ago? Let us start at the beginning, with the samples. Only samples from pristinely (not ‘well’) characterized cohorts that have been uniformly stored under acceptable conditions should be analyzed. There is no sense initiating a biomarker discovery study without having complete confidence in the samples being analyzed. The number of samples in each arm of a comparative study needs to increase. The background variation in human samples is enormous. Proteomic biomarker studies have no problem in finding differences between samples. In fact, most studies will find hundreds of differences in protein abundance when comparing sample cohorts. The challenge is recognizing those differences that are related to the disease. I would propose that data from at least 25 (and more if possible) samples from each arm of a study need to be compared. Analyzing this many samples using current technologies will take months, if not years, but it is necessary to differentiate disease-related proteins from background variation. I have often asked individuals in my laboratory the following question: ‘If I asked you to devote 5 years of your scientific career to doing nothing but conducting a biomarker discovery project but could guarantee that you would discover a clinically useful biomarker, would you do it?’ Everyone I have ever asked this of has replied ‘Yes’. Biomarker discovery is a classic ‘high-risk, high-reward’ type of study.
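
The effect of cohort size on background variation can be illustrated with a small simulation. The sketch below is hypothetical: it assumes log2 protein abundances that vary normally between individuals with a standard deviation of 1, no true disease effect at all, and a twofold difference in arm means as the reporting threshold. Every parameter and function name is an assumption chosen for illustration, not a description of any real data set.

```python
import random
import statistics


def count_spurious_differences(n_per_arm: int,
                               n_proteins: int = 2000,
                               sd: float = 1.0,
                               min_log2_fold: float = 1.0,
                               seed: int = 0) -> int:
    """Count proteins whose arm means differ by chance alone.

    Both arms are drawn from the same distribution, so every protein that
    crosses the threshold is a false lead caused by background variation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_proteins):
        controls = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        cases = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        difference = statistics.fmean(cases) - statistics.fmean(controls)
        if abs(difference) >= min_log2_fold:
            hits += 1
    return hits


if __name__ == "__main__":
    for n in (5, 25):
        print(n, "samples per arm:", count_spurious_differences(n), "spurious differences")
```

Under these assumptions, far more proteins cross the threshold by chance with five samples per arm than with 25, which is the reasoning behind asking for larger cohorts. Real studies would also need a proper statistical test and correction for multiple comparisons, which this sketch deliberately omits.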

The relative peptide and protein quantitation measurements obtained using MS need to improve. While a number of quantitation strategies have been developed, there is no consensus as to which one is superior. Most laboratories favor the strategy that has provided them with success in the past. For instance, our laboratory utilized isotope-coded affinity tagging (ICAT) several years ago because it worked well in our hands; however, I met a number of investigators who claimed that ICAT did not work. In targeted MS assays to quantitate small molecules, internal standards are spiked into samples to verify the amount being measured, a strategy that is being employed in proteomic biomarker discovery.
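
The internal-standard strategy described above can be sketched in a few lines. This is a minimal illustration that assumes a linear detector response and a known spiked amount of standard; the function name, the `response_factor` parameter and the example numbers are hypothetical.

```python
def quantify_with_internal_standard(analyte_area: float,
                                    standard_area: float,
                                    standard_amount: float,
                                    response_factor: float = 1.0) -> float:
    """Estimate the analyte amount from its peak area relative to a spiked standard.

    response_factor is the analyte-to-standard signal ratio at equal amounts;
    1.0 assumes the two ionize identically, as for a stable isotope-labeled
    version of the same peptide.
    """
    if standard_area <= 0 or response_factor <= 0:
        raise ValueError("standard_area and response_factor must be positive")
    return (analyte_area / standard_area) * standard_amount / response_factor


if __name__ == "__main__":
    # Hypothetical peak areas with 50 fmol of labeled standard spiked in.
    print(quantify_with_internal_standard(2.0e6, 1.0e6, 50.0))
```

Because the analyte and standard are measured in the same run, run-to-run changes in instrument response largely cancel out of the ratio.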

Biomarker discovery, and proteomics overall, would benefit from new ionization methods that do not depend on the chemical properties of each molecule or its environment. ESI and MALDI have revolutionized MS and biological science; however, not every peptide is ionized equally and, therefore, the signal that is detected does not reflect the molar amount of each molecule in the sample. In addition, the ionization efficiency of a molecule depends on its environment at the time of ionization (i.e., the other molecules within the solution). Consequently, a tryptic digest of a protein, in which the peptides are present in equimolar amounts, does not show peaks of equal intensity when analyzed using ESI- or MALDI-MS. A process that would ionize molecules in a fashion analogous to a chemical reaction possessing pseudo-first-order kinetics could be the first step in providing MS signals that actually reflect the quantity of each peptide in the sample. In all likelihood, the next revolution in MS will be the development of a novel ionization source.

Revolutionary data analysis algorithms are also needed to enhance biomarker discovery. The species giving rise to the vast majority (i.e., 75–85%) of MS/MS spectra recorded during a biomarker discovery project are never identified. There are many reasons why these species are not identified, including poor spectral quality, modifications that are not considered in the database search, incomplete databases and insufficient computer processing speed, among others. Much of the clamor over the years has been ‘We need instruments with greater sensitivity to identify biomarkers.’ The evidence suggests that we have plenty of sensitivity [3]; what we lack are the computational tools to turn the detected signals into molecular identifications.

Conclusion & future outlook

Will current proteomic technologies ever deliver on the promise of clinically useful biomarkers? Personally, I think they will. Two things are necessary. First, more development is required at the sample preparation, data acquisition and analysis steps. Second, laboratories need to be given the resources required to discover biomarkers and to have the fortitude to commit them. In many ways, the peer-review system has done an excellent job of dissuading laboratories from pursuing frivolous biomarker-discovery projects that were not going to lead beyond a publication. Studies in which a handful of samples are compared and differences are arbitrarily reported, but no reasonable attempt at verification or validation is made, should no longer be pursued. Finding clinically useful biomarkers is not going to be easy; therefore, we need to stop treating it as though it is.

Acknowledgements

The author wishes to thank Jacob Veenstra for editorial assistance.

Disclaimer

The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the United States Government.

Financial & competing interests disclosure

This project has been funded in whole or in part with federal funds from the National Cancer Institute, NIH, under Contract N01-CO-12400. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.

References

  • Washburn MP, Wolters D, Yates JR 3rd. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19, 242–247 (2001).
  • Monetti M, Nagaraj N, Sharma K, Mann M. Large-scale phosphosite quantification in tissues by a spike-in SILAC method. Nat. Methods 8, 655–658 (2011).
  • Michalski A, Cox J, Mann M. More than 100,000 detectable peptide species elute in single shotgun proteomics runs but the majority is inaccessible to data-dependent LC-MS/MS. J. Proteome Res. 10, 1785–1793 (2011).
  • Matsudaira P. Sequence from picomole quantities of proteins electroblotted onto polyvinylidene difluoride membranes. J. Biol. Chem. 262, 10035–10038 (1987).
