Editorial

Biomarker discovery and validation: the tide is turning

Pages 505-507 | Published online: 09 Jan 2014

Despite huge efforts in biomarker discovery over the last decade and an exponential rise in the number of publications in the field, the translation of potential new biomarkers into clinically approved tests has been disappointingly low (<30) Citation[1], and scepticism about the feasibility and value of biomarker research has arisen in some quarters. There is nonetheless no doubt that biomarkers can work and that they represent an important part of the arsenal required for successful treatment of many debilitating diseases, especially cancer, which exact a tremendous toll on modern society. Taking colorectal cancer, currently the third most common cancer worldwide, as an example, even biomarkers with relatively poor sensitivity and specificity (e.g., the fecal occult blood test) have been shown to be associated with reduced mortality Citation[2]. The fecal occult blood test is the front-line test in many countries, and its false-positive results currently lead to many unnecessary follow-up colonoscopies. New biomarkers or biomarker panels with improved performance would immediately reduce that number and, by allowing surgical resection while the tumor is still localized, would also improve survival Citation[2], with enormous savings to the global health budget.
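
The link between test specificity and unnecessary follow-up procedures can be made concrete with a little arithmetic. The sketch below is illustrative only: the population size, prevalence, sensitivity and specificity values are hypothetical assumptions chosen to show the calculation, not figures taken from the cited studies, and `screening_outcomes` is a helper name introduced here.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Expected counts when a whole population is screened once."""
    diseased = population * prevalence
    healthy = population - diseased
    true_positives = diseased * sensitivity
    # Healthy people who test positive are referred on unnecessarily.
    false_positives = healthy * (1.0 - specificity)
    referred = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        # Share of positive results that reflect real disease.
        "ppv": true_positives / referred if referred else float("nan"),
    }


# Hypothetical inputs: same sensitivity, two different specificities.
current = screening_outcomes(100_000, 0.005, 0.70, 0.95)
improved = screening_outcomes(100_000, 0.005, 0.70, 0.98)
avoided = current["false_positives"] - improved["false_positives"]
print(f"Unnecessary follow-ups avoided: {avoided:.0f}")
```

Because healthy individuals vastly outnumber cases in a screening setting, even a small gain in specificity removes a large number of false-positive referrals under these assumed values.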

More than 10 years ago, advances in genomics and proteomics were perceived to have altered the landscape of early detection with the promise to expand the repertoire of clinically useful screening tests Citation[2]. So, why then has success to date been so minimal? Key reasons include lack of appropriate standard operating procedures for sample collection and storage, poor (often underpowered) study design and execution, lack of appropriate analytical techniques and, most importantly, promiscuity of individual proteins for multiple pathological conditions. Indeed, it is now realized that individual biomarkers are unlikely to have sufficient sensitivity and specificity, and that the combinatorial power of panels of biomarkers (possibly representing different biological aspects of the disease), ideally measured with multiplexed detection methods, will be required. However, in spite of all the aforementioned caveats and concerns, the tide is definitely turning and, as was the case in genomics, the change is largely being driven by advances in technology, especially in the area of proteomics.

The current paradigm for biomarker discovery and validation comprises both a discovery and a translational phase. The National Cancer Institute's (NCI) Early Detection Research Network (EDRN), which was established in 2000, has identified a five-phase approach, namely Discovery, Clinical Assay and Validation, Retrospective Longitudinal Study, Prospective Screening and Evaluation of Cancer Impact Citation[3]. Many key US proteomics laboratories are part of the EDRN. As recognized by the Human Proteome initiative Citation[4], the key platforms supporting global proteomics studies are mass spectrometry, antibody-based techniques and bioinformatics, and all are playing an important role in effective biomarker discovery and validation.

First and foremost among the recent advances in protein mass spectrometry relevant to biomarker studies has undoubtedly been the development of targeted, sensitive and specific quantitative techniques such as multiple reaction monitoring (MRM). MRM is a well-established technology for small molecules and has supported the pharmaceutical industry, forensic science and toxicology for over 30 years. However, improvements in the design of triple quadrupole (QQQ) mass spectrometry (MS) instruments (in particular speed, sensitivity and dynamic range) now extend this technology to multiplexed quantitative analysis of proteins in complex biological samples Citation[5,6], with the potential to analyze hundreds of candidate biomarkers simultaneously. In MRM, specific proteotypic peptides are introduced by HPLC into the first quadrupole of a QQQ instrument and, following collision-induced dissociation in the second quadrupole, selected transition ions are monitored in the third quadrupole. Absolute quantitation can be achieved using isotopically labeled internal standards (e.g., JPT Spiketides) Citation[6], which can also be used to validate the sample spectra obtained (equivalent elution time and transition ions). The proteotypic peptides and their characteristic transition ions can be identified from libraries generated during the discovery phase using the biological matrix under investigation Citation[6] or from databases such as SRM Atlas Citation[101]. Selectivity is afforded by the specific LC retention time of the proteotypic peptide, the mass of the proteotypic peptide itself and its characteristic MS/MS transition ions. Since the elution window for each peptide is narrow, individual MRMs need only be monitored during that time, and hence large numbers of MRMs can be measured during each run (scheduled MRM).
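
The benefit of scheduling can be illustrated with a minimal sketch. It assumes each transition is acquired only within a fixed window centred on its expected retention time and counts how many transitions the instrument must monitor at once. The `Transition` class, the placeholder peptide names, the m/z values, the retention times and the 2 min window are all hypothetical and are not taken from any real assay or vendor software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    peptide: str               # placeholder label, not a real sequence
    precursor_mz: float        # selected in the first quadrupole
    product_mz: float          # monitored in the third quadrupole
    retention_time_min: float  # expected LC elution time


def max_concurrent(transitions, window_min):
    """Peak number of transitions monitored at the same moment when each one
    is acquired only within +/- window_min / 2 of its retention time."""
    events = []
    for t in transitions:
        half = window_min / 2
        events.append((t.retention_time_min - half, 1))   # window opens
        events.append((t.retention_time_min + half, -1))  # window closes
    # At equal times, process closings first so touching windows do not overlap.
    events.sort(key=lambda e: (e[0], e[1]))
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical method: three peptides, two transitions each (made-up values).
method = [
    Transition("PEPTIDE_A", 500.3, 700.4, 10.0),
    Transition("PEPTIDE_A", 500.3, 813.5, 10.0),
    Transition("PEPTIDE_B", 620.8, 905.5, 20.0),
    Transition("PEPTIDE_B", 620.8, 792.4, 20.0),
    Transition("PEPTIDE_C", 455.2, 611.3, 30.0),
    Transition("PEPTIDE_C", 455.2, 724.4, 30.0),
]

print("unscheduled:", len(method), "transitions monitored throughout the run")
print("scheduled:  ", max_concurrent(method, window_min=2.0), "at most at any time")
```

With a fixed cycle time, fewer concurrent transitions leave a longer dwell time for each one, which is why scheduling allows many more MRMs to be fitted into a single run.
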
Extending the concept of targeted analysis, another recent important technological advance that will further ensure consistent and accurate proteome analysis is Sequential Window Acquisition of all Theoretical Mass Spectra (SWATH) Citation[7]. In this approach, data are acquired on a fast, high-resolution quadrupole–quadrupole time-of-flight instrument by continually cycling through 32 consecutive 25 Da precursor isolation windows (SWATHs). This approach generates a permanent digital record of all fragment ion spectra of each component of the sample, allowing data to be remined ad libitum. SWATH has the same dynamic range as MRM (with sensitivity in the nanogram per milliliter range), but gives much increased coverage per unit time. Importantly, these targeted technologies afford generic platforms with the potential for detailed biomarker validation on relatively large numbers of clinical samples against multiple targets, something that was not readily achievable when antibody-based assays were used and specific (and costly) reagents had to be developed and validated for each individual target. Additionally, problems of nonspecific binding and matrix-based effects, which are frequently encountered in ELISA-type assays, are minimized, although in some cases preliminary sample preparation is still required to minimize ion suppression effects.
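
The window scheme described above can be sketched in a few lines. The 32 windows of 25 Da each come from the text; the lower bound of 400 m/z is an assumption made here for illustration (the text does not state where the first window starts), and the function names are hypothetical.

```python
def swath_windows(start_mz, width_da, n_windows):
    """Return consecutive (lower, upper) precursor isolation windows."""
    return [
        (start_mz + i * width_da, start_mz + (i + 1) * width_da)
        for i in range(n_windows)
    ]


def window_index(precursor_mz, start_mz, width_da, n_windows):
    """Index of the window that isolates a precursor, or None if out of range."""
    upper_limit = start_mz + n_windows * width_da
    if not (start_mz <= precursor_mz < upper_limit):
        return None
    return int((precursor_mz - start_mz) // width_da)


# 32 x 25 Da windows as in the text; the 400 m/z start is an assumed value.
windows = swath_windows(start_mz=400.0, width_da=25.0, n_windows=32)
print("first window:", windows[0])
print("last window: ", windows[-1])
print("covered range (Da):", windows[-1][1] - windows[0][0])
```

Because every precursor in the covered range falls into exactly one window on every cycle, fragment spectra are recorded for all of them, which is what allows the data to be re-interrogated later for targets that were not anticipated at acquisition time.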

Antibodies do, however, still have important roles to play in proteomics-based biomarker discovery and validation. Immuno-MS techniques (e.g., stable isotope standards with capture by anti-peptide antibodies [SISCAPA]) Citation[8] have been developed both to enrich for low-abundance proteins, thus extending the dynamic range of MRM (to the low picogram per milliliter range), and to offer a simple and highly selective form of sample preparation using specific anti-peptide rabbit polyclonal antibodies. More recently, the concept has been extended to use renewable monoclonal antibodies Citation[9], incorporate laboratory automation using magnetic beads for the sample preparation and provide a multiplexed format Citation[9,10]. Because of the selectivity offered by the downstream mass spectrometric detection, the stringency required of the antibody is greatly reduced and, unlike sandwich ELISAs, only a single antibody is required.

Detailed characterization and quantitation of post-translational modifications (PTMs) of proteins, which have the potential to be highly disease specific, are recognized as a key area for biomarker discovery. Immunoaffinity mass spectrometry at the peptide level has been successful in the comprehensive identification of phosphopeptides Citation[11] and has been shown to be complementary to multidimensional purification strategies. Reversible phosphorylation of proteins is the most common PTM in cell signalling pathways and is frequently dysregulated in disease states. Aberrant protein glycosylation is also a frequent hallmark of disease, especially oncogenesis. Although progress in glycoproteomics has been technically challenging, recent advances in both selective glycoprotein enrichment (especially multiple lectin affinity) Citation[12] and mass spectrometry (e.g., QTOF, FT-ICR-MS with ECD or IRMPD) Citation[13] are opening up this field.

Another important and rapidly developing area that will assist biomarker studies is top-down proteomics for the analysis of intact proteins, which is now capable of large-scale application to proteins of <50 kDa. A recent study identified 1,220 proteins (yielding over 5,000 proteoforms) from a human lung carcinoma cell line, including 300 membrane proteins Citation[14]. Unlike bottom-up protocols where, following enzymatic digestion, detailed information on PTMs and sequence variants is compromised, in top-down approaches intact proteins are analyzed allowing unequivocal identification and location of specific modifications.

In conjunction with proteomics, integration of genomic and transcriptomic data further enriches the biomarker landscape Citation[15] and offers the potential for comprehensive dissection of specific biological systems at the transcriptional and translational level. Thus, the flow of information from genome to transcriptome to proteome will enable the full complexity of the phenome for specific biological events to be understood Citation[16]. Databases such as the Encyclopedia of DNA Elements (ENCODE) Citation[17] will prove invaluable in such studies. Furthermore, browsers (e.g., The Protein Browser Citation[102]) enabling detailed and unbiased interrogation of genome, transcriptome, antibody (e.g., The Protein Atlas Citation[18]) and proteomics databases are currently being developed.

Taken together, the current proteomics arsenal, coupled with advances in microarray technology, opens the door for "big science" systems biology: global, collaborative, multidisciplinary initiatives that will require protein chemists, molecular biologists, physicists, engineers, clinicians, bioinformaticians, statisticians, computer scientists and regulatory experts, together with the appropriate infrastructure (instrument companies, biobanks, technology platforms, diagnostic companies). Operations such as the Human Proteome initiative and EDRN provide potential templates Citation[19]. However, such efforts can only succeed if there is adequate and appropriate funding support worldwide and, as has been pointed out Citation[19,20], they will require a change in mindset, with a move away from the current, typically investigator-centric research model, to bring these big science teams together. We can but hope that these advances in proteomics and related technologies will indeed lead to new and improved biomarker panels for disease detection and surveillance, with the enormous benefits to the global community they will bring with them.

Financial & competing interests disclosure

E Nice is supported, in part, by NHMRC Grants 1010303, 1017078 and 603130 and Monash University. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

No writing assistance was utilized in the production of this manuscript.

References

  • McDermott JE, Wang J, Mitchell H et al. Challenges in biomarker discovery: combining expert insights with statistical analysis of complex omics data. Expert Opin. Med. Diagn. 7(1), 37–51 (2013).
  • Etzioni R, Urban N, Ramsey S et al. The case for early detection. Nat. Rev. Cancer 3(4), 243–252 (2003).
  • Wagner PD, Srivastava S. New paradigms in translational science research in cancer biomarkers. Transl. Res. 159(4), 343–353 (2012).
  • Legrain P, Aebersold R, Archakov A et al. The human proteome project: current state and future direction. Mol. Cell. Proteomics 10(7), M111.009993 (2011).
  • Anderson L, Hunter CL. Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol. Cell. Proteomics 5(4), 573–588 (2006).
  • Ang CS, Rothacker J, Patsiouras H, Gibbs P, Burgess AW, Nice EC. Use of multiple reaction monitoring for multiplex analysis of colorectal cancer-associated proteins in human feces. Electrophoresis 32(15), 1926–1938 (2011).
  • Gillet LC, Navarro P, Tate S et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteomics 11(6), O111.016717 (2012).
  • Anderson NL, Anderson NG, Haines LR, Hardie DB, Olafson RW, Pearson TW. Mass spectrometric quantitation of peptides and proteins using stable isotope standards and capture by anti-peptide antibodies (SISCAPA). J. Proteome Res. 3(2), 235–244 (2004).
  • Schoenherr RM, Zhao L, Whiteaker JR et al. Automated screening of monoclonal antibodies for SISCAPA assays using a magnetic bead processor and liquid chromatography-selected reaction monitoring-mass spectrometry. J. Immunol. Methods 353(1–2), 49–61 (2010).
  • Whiteaker JR, Zhao L, Anderson L, Paulovich AG. An automated and multiplexed method for high throughput peptide immunoaffinity enrichment and multiple reaction monitoring mass spectrometry-based quantification of protein biomarkers. Mol. Cell. Proteomics 9(1), 184–196 (2010).
  • Di Palma S, Zoumaro-Djayoon A, Peng M et al. Finding the same needles in the haystack? A comparison of phosphotyrosine peptides enriched by immuno-affinity precipitation and metal-based affinity chromatography. J. Proteomics 91C, 331–337 (2013).
  • Zeng Z, Hincapie M, Pitteri SJ et al. A proteomics platform combining depletion, multi-lectin affinity chromatography (M-LAC), and isoelectric focusing to study the breast cancer proteome. Anal. Chem. 83(12), 4845–4854 (2011).
  • Kuzmanov U, Kosanam H, Diamandis EP. The sweet and sour of serological glycoprotein tumor biomarker quantification. BMC Med. 11, 31 (2013).
  • Catherman AD, Durbin KR, Ahlf DR et al. Large-scale top down proteomics of the human proteome: membrane proteins, mitochondria, and senescence. Mol. Cell. Proteomics doi:10.1074/mcp.M113.030114 (2013) (Epub ahead of print).
  • Harris TJ, McCormick F. The molecular pathology of cancer. Nat. Rev. Clin. Oncol. 7(5), 251–265 (2010).
  • Paik YK, Hancock WS. Uniting ENCODE with genome-wide proteomics. Nat. Biotechnol. 30(11), 1065–1067 (2012).
  • ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414), 57–74 (2012).
  • Uhlen M, Oksvold P, Fagerberg L et al. Towards a knowledge-based human protein atlas. Nat. Biotechnol. 28(12), 1248–1250 (2010).
  • Hood LE, Omenn GS, Moritz RL et al. New and improved proteomics technologies for understanding complex biological systems: addressing a grand challenge in the life sciences. Proteomics 12(18), 2773–2783 (2012).
  • Poste G. Biospecimens, biomarkers, and burgeoning data: the imperative for more rigorous research standards. Trends Mol. Med. 18(12), 717–722 (2012).
