Editorial

Glycobiology and proteomics: has mass spectrometry moved the field forward?

Pages 303-307 | Received 27 Apr 2023, Accepted 26 Jul 2023, Published online: 05 Sep 2023

1. Introduction

Glycosylation is one of the most important and common protein modifications, and many enzymes are responsible for protein glycosylation in cells [Citation1]. In eukaryotes, there are two major types of glycosylation, i.e., N-glycosylation and mucin-type O-glycosylation. In N-glycosylation, N-glycans are covalently bound to an asparagine residue within the consensus motif ‘N-X-S/T’ (where X can be any amino acid other than proline). This modification primarily occurs in the classical secretory pathway and is critical for protein folding and trafficking. It plays vital roles in regulating nearly every extracellular activity, such as cell-cell communication, cell-matrix interactions, and cellular immune responses [Citation2].
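
The consensus motif can be scanned for directly in a protein sequence. The following is a minimal sketch, assuming a one-letter amino acid string as input; the function name and the toy sequence are hypothetical. It only lists candidate asparagine residues, since the presence of the motif does not by itself show that a site is glycosylated.

```python
import re

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# A lookahead keeps the match at the Asn itself, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical toy sequence, not a real protein.
toy = "MKNGSAANPTLLNNTSQ"
print(find_sequons(toy))  # N8 is skipped because it is followed by Pro
```

Candidate sites found this way still need experimental confirmation, for example by the MS-based approaches discussed in the following sections.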

Mucin-type O-glycosylation is initiated by the covalent attachment of GalNAc to serine or threonine residues [Citation3]. There are eight core O-glycan structures in human cells. This type of modification is commonly found on mucins and many other glycoproteins. Protein O-GlcNAcylation is a unique type of glycosylation that normally occurs in the nucleus and the cytoplasm of eukaryotic cells. It is involved in many cellular activities, including signal transduction and gene transcription [Citation4,Citation5]. In addition, there are multiple other types of glycosylation, such as O-fucosylation, O-mannosylation, and C-mannosylation [Citation1], which have diverse functions in regulating protein activities and cellular events. In the past decades, mass spectrometry (MS)-based proteomics has been extensively applied to study glycoproteins, from the identification of glycoproteins and the localization of glycosylation sites to the elucidation of glycan structures and the quantification of glycoproteins in different samples (Figure 1), making significant contributions to our understanding of the properties and functions of glycoproteins.

Figure 1. Advances in MS-based glycoproteomics enable multifaceted investigation of glycoproteins. (a) Coupling effective enrichment with glycoproteomics allows for the discovery of unknown glycoproteins in biological systems. (b) Tandem MS enables glycosylation sites to be pinpointed. (c) Development of fragmentation methods and bioinformatic toolkits facilitates the elucidation of glycan structures. (d) Quantitative glycoproteomics is applied to systematically investigate the properties and functions of glycoproteins, and their implications in diseases (Figure 1b is adapted with permission from ref [Citation6]. Copyright American Chemical Society.).

2. MS-based proteomics allows for the discovery of glycoproteins in various biological systems

Unlike a protein sequence, which can be inferred from the corresponding DNA or mRNA sequence, the presence of glycosylation is not predictable. Therefore, glycosylation must be detected experimentally. Previously, the identification of glycoproteins relied on antibodies, and only a small number of proteins could be studied at a time. MS-based proteomics provides a unique opportunity to globally characterize glycoproteins because of its high throughput, low detection limit, and high accuracy, and because it does not require antibodies. However, glycoproteins are usually of low abundance in cells, and it is challenging to identify them among many non-modified proteins of much higher abundance [Citation7]. To achieve comprehensive analysis of glycoproteins in a complex biological sample, effective enrichment is imperative prior to MS analysis.

Currently, glycoproteins can be identified on a global scale through effective enrichment coupled with MS-based proteomics [Citation8–10]. For protein N-glycosylation, more than 4000 N-glycosylation sites were identified from almost 2000 N-glycoproteins in human cells in one study [Citation11]. For mucin-type O-glycosylation, more than 3000 glycosylation sites were identified on over 1000 proteins in human kidney tissues, T cells, and serum [Citation12]. Regarding protein O-GlcNAcylation, in our recent work, we site-specifically identified more than 900 O-GlcNAcylated proteins in MCF7 cells by integrating metabolic labeling, click chemistry, and MS-based proteomics [Citation13]. Glycoproteins on the cell surface are responsible for the regulation of nearly every extracellular activity. In our lab, we developed an effective method coupling metabolic labeling, bioorthogonal chemistry, and MS-based proteomics to globally and site-specifically analyze glycoproteins on the cell surface [Citation14], and this method was applied to quantify the dynamics of glycoproteins on the cell surface [Citation15,Citation16]. Based on the proteomic results, curated databases have been established for different types of glycosylation (Table 1). Glycoproteomic studies have unveiled the identities of thousands of proteins with various types of glycosylation, providing valuable information to advance our understanding of glycoproteins in biological systems.

Table 1. Some curated databases for different types of protein glycosylation.

3. Pinpointing glycosylation sites and elucidating glycan structures by MS

For N-glycosylation and mucin-type O-glycosylation, glycans normally contain different numbers of monosaccharide units with various linkages. Because of the heterogeneity and complexity of glycans, it is extremely challenging to localize glycosylation sites on proteins and to elucidate glycan structures using traditional methods. Modern MS-based proteomics enables global and confident localization of glycosylation sites. To identify N-glycosylation sites, an enzyme (PNGase F) is commonly employed to remove N-glycans in heavy-oxygen water, generating a distinctive mass tag for site-specific analysis by MS [Citation21]. Coupled with an effective enrichment method, this approach allows N-glycosylation sites on proteins from different samples to be globally identified [Citation22,Citation23]. Alternatively, without de-glycosylation, tandem mass spectra may be complicated but contain a wealth of information about glycan structures. The presence of some monosaccharide units (such as sialic acid in N-glycans) may serve as a biomarker for disease detection [Citation24]. To better identify intact N-glycopeptides, different fragmentation methods have been employed, such as higher-energy collisional dissociation (HCD), stepped collision energy HCD (sceHCD), activated ion electron transfer dissociation (AI-ETD), and ultraviolet photodissociation (UVPD) [Citation25]. HCD and sceHCD are relatively popular for N-glycopeptide analysis, and three major types of ions are generated: oxonium ions (fragments of monosaccharides), peptide fragments (b/y ions), and glycan fragments with the peptide attached. Because of the various types of ions generated, the spectra are often very complicated. Newly developed software, including pGlyco3 [Citation26] and MSFragger-Glyco [Citation27], helps extract valuable information about glycopeptides from these spectra.
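
The size of the mass tag left by PNGase F can be worked out from atomic masses. The sketch below assumes the commonly described chemistry, in which the enzyme converts the formerly glycosylated asparagine (side chain -CONH2) to aspartate (-COOH) and the newly incorporated oxygen comes from the solvent. The constant and function names are illustrative.

```python
# Monoisotopic atomic masses in Da.
MASS_H = 1.00782503
MASS_N = 14.00307401
MASS_O16 = 15.99491462
MASS_O18 = 17.99915961

def asn_to_asp_shift(solvent_oxygen_mass: float) -> float:
    """Mass change for Asn -> Asp: the side chain loses N and H and gains one O."""
    return solvent_oxygen_mass - MASS_N - MASS_H

def mz_shift(mass_shift: float, charge: int) -> float:
    """Observed m/z shift of a peptide ion carrying `charge` protons."""
    return mass_shift / charge

in_normal_water = asn_to_asp_shift(MASS_O16)  # ordinary deamidation
in_heavy_water = asn_to_asp_shift(MASS_O18)   # 18O-tagged former glycosite

print(f"H2(16)O: +{in_normal_water:.4f} Da")
print(f"H2(18)O: +{in_heavy_water:.4f} Da")
print(f"2+ ion, H2(18)O: +{mz_shift(in_heavy_water, 2):.4f} m/z")
```

The roughly 2 Da difference between the two shifts is what helps separate enzymatically tagged sites from asparagine residues that deamidated spontaneously in ordinary water.
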
Recently, ion mobility mass spectrometry has gained popularity in glycomics and glycoproteomics because it can separate structurally similar glycans and glycopeptides based on their conformational differences. For example, it was reported that ion mobility can improve both the identification and quantification of N-glycopeptides during MS analysis [Citation28].

For mucin-type O-linked glycosylation, site localization is more challenging because the linkage between glycans and the peptide backbone is fragile during MS analysis, and serine and threonine residues frequently occur near the glycosylation sites. It was reported that electron-transfer/higher-energy collision dissociation (EThcD) could minimize the cleavage of glycosidic linkages during peptide backbone fragmentation, which enables confident site localization for O-linked glycosylation [Citation29]. Furthermore, cells can be genetically engineered to produce homogeneous truncated O-glycans, which simplifies the enrichment and analysis of O-glycopeptides by MS [Citation30]. Additionally, database search tools have been developed for protein O-glycosylation analysis, including O-Pair [Citation31], which improves the confidence of O-glycosylation site localization and minimizes the search time. Recently, a couple of proteases that specifically cleave at the N-termini of serine or threonine residues modified with mucin-type O-glycans were reported for studying protein O-glycosylation [Citation12,Citation32]. Site localization and glycan structure are pivotal for investigating the effects of glycosylation on protein properties and functions.
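
The effect of neighboring serine and threonine residues on site localization can be illustrated by counting the possible site assignments for one peptide. This is a minimal sketch under simplifying assumptions that are not from the source: every Ser/Thr is treated as an equally plausible site, each site carries at most one glycan, and all glycans are identical. The peptide and function names are hypothetical.

```python
from math import comb

def candidate_sites(peptide: str) -> list[int]:
    """1-based positions of Ser/Thr residues, i.e. possible O-glycosites."""
    return [i + 1 for i, aa in enumerate(peptide.upper()) if aa in "ST"]

def localization_count(peptide: str, n_glycans: int) -> int:
    """Number of ways to place n identical glycans on distinct Ser/Thr sites."""
    return comb(len(candidate_sites(peptide)), n_glycans)

# Hypothetical mucin-like peptide, not taken from a real protein.
toy = "APTSSSTKKTQLR"
print(candidate_sites(toy))        # six candidate sites
print(localization_count(toy, 1))  # 6 possible assignments for one glycan
print(localization_count(toy, 2))  # 15 possible assignments for two glycans
```

Each of these assignments has the same precursor mass, so only fragment ions that retain the glycan on the backbone can tell them apart.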

4. Quantitative proteomics reveals the properties and functions of glycoproteins

Although many glycoproteins have been identified, the functions of most of them remain largely unknown. The emergence of quantitative proteomic methods for accurate and comprehensive quantification of glycoproteins in multiple samples under different conditions greatly expedites the investigation of glycosylation in the regulation of cellular activities and in disease samples. Moreover, bioinformatic analysis helps extract valuable information from changes in protein glycosylation, including the biological processes and pathways involved. Commonly used quantitative proteomics methods include stable isotope labeling by amino acids in cell culture (SILAC), in which heavy amino acids are added to the cell culture to label glycoproteins for MS analysis [Citation33]. The tandem mass tag (TMT) method is also frequently employed. Glycopeptides from different samples are chemically labeled with different TMT reagents, and the intensities of the reporter ions in tandem MS allow glycopeptides/glycoproteins in multiple samples to be quantified simultaneously, which increases the throughput and quantification accuracy [Citation34]. Label-free quantification (LFQ) does not require any pre-labeling of glycopeptides, and the intensities of glycopeptides in different runs can be compared directly. This technique is beneficial for speedy sample preparation and for preventing sample loss [Citation22].
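
Reporter-ion quantification can be sketched as follows. The example assumes a table of reporter intensities with one row per glycopeptide and one column per TMT channel, and it equalizes the total intensity of each channel before computing ratios. That normalization is a common choice rather than something specified in the text, and all intensities are made-up illustration values.

```python
from math import log2

def normalize(rows: list[list[float]]) -> list[list[float]]:
    """Scale each channel (column) to the same total intensity.

    This corrects for unequal sample loading; it assumes most
    glycopeptides do not change between samples.
    """
    sums = [sum(col) for col in zip(*rows)]
    target = sum(sums) / len(sums)
    return [[v * target / s for v, s in zip(row, sums)] for row in rows]

def log2_ratios(row: list[float], reference: int = 0) -> list[float]:
    """log2 fold change of every channel relative to the reference channel.

    Assumes all intensities are non-zero.
    """
    return [log2(v / row[reference]) for v in row]

# Hypothetical reporter intensities: 3 glycopeptides x 3 channels.
reporters = [
    [100.0, 200.0, 400.0],
    [300.0, 600.0, 1200.0],
    [100.0, 400.0, 400.0],
]

for row in normalize(reporters):
    print([round(x, 2) for x in log2_ratios(row)])
```

In a label-free experiment the same ratio calculation would apply, with the columns holding precursor intensities from separate runs instead of reporter ions from one spectrum.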

With quantitative glycoproteomics, the properties and functions of glycoproteins can be investigated, including glycoprotein dynamics, glycosylation stoichiometry, glycoprotein interactions with other macromolecules (such as DNA, RNA, and proteins), glycoprotein distribution in different subcellular compartments, and the functions of glycosylation in various biological processes. Additionally, the roles of glycosylation in multiple diseases, such as cancer, neurodegenerative diseases, and diabetes, can be studied. There is great potential to use quantitative glycoproteomics for studying glycoproteins in biological systems to uncover their functions. Quantitative glycoproteomics is expected to have extensive applications for disease studies to unravel the underlying mechanisms, and to discover glycoproteins and/or enzymes responsible for aberrant protein glycosylation as biomarkers for disease detection and as drug targets for disease treatment.

5. Expert opinion

Currently, MS-based glycoproteomics has emerged as a powerful tool for studying glycoproteins in biological, biochemical, and biomedical research. Specifically, the discovery of previously unknown glycoproteins and glycosylation sites has paved the way for follow-up studies on the effects of glycosylation on protein properties and functions and on the regulation of cellular activities by glycosylation. For example, MS-based proteomics has revealed the identities of many O-GlcNAcylated proteins, and their functions in the regulation of cell signaling and gene expression are gradually being recognized. Moreover, quantitative glycoproteomics provides a unique opportunity to discover biomarkers in human diseases, including cancer, diabetes, and neurodegenerative diseases. However, there are still some limitations that hinder its wider application in the field of glycobiology.

1) Glycopeptides and glycans are detected by MS based on their masses. However, some glycans have exactly the same molecular weight but different structures, and they are not easily distinguished by MS. For example, O-GlcNAc and O-GalNAc have the same composition and molecular weight but distinct functions. Furthermore, a glycan may contain any of several monosaccharide units with the same composition and molecular weight, such as glucose, galactose, mannose, and fructose. To elucidate the glycan structure, the exact monosaccharide unit must be determined. Therefore, novel and effective methods, such as chemical/enzymatic derivatization methods or fragmentation methods, need to be developed.

2) Some existing methods for glycoprotein enrichment are not highly effective. For example, hydrophilic interaction liquid chromatography (HILIC) is based on the hydrophilicity differences between glycopeptides and non-glycopeptides, and it lacks specificity. For lectin-based methods, each lectin recognizes only certain glycan(s), and thus a single lectin or a few lectins may not be able to cover glycopeptides with highly diverse glycans. Furthermore, lectins, as macromolecules, also give rise to strong nonspecific binding. Therefore, innovative and effective methods for selective enrichment of glycoproteins/glycopeptides need to be further developed.

3) Sample preparation for proteomic analysis is lengthy. It normally requires purification and fractionation of samples before MS analysis, which is not ideal for processing many samples. Easy and rapid sample handling techniques and automated sample processing will further improve glycoprotein studies.

4) Glycans lack readily protonated sites, resulting in relatively low protonation efficiencies of glycopeptides during MS analysis. Moreover, the optimal activation energy for fragmenting glycans differs from that for peptide backbones. Optimization of MS parameters for glycopeptide analysis would facilitate glycopeptide identification and deconvolution of glycan structures.

Despite the current limitations, MS-based proteomics will play a critical role in the field of glycobiology thanks to its incomparable throughput and sensitivity. It is foreseeable that MS-based glycoproteomics will become more popular in the field of glycobiology and will further advance glycoscience in the following aspects:

1) Coupled with new enrichment methods and fragmentation techniques, MS-based glycoproteomics can achieve more in-depth coverage of glycoproteins.

2) Quantitative glycoproteomics will become more popular in the study of cellular events and disease pathologies because it enables the quantification of many glycoproteins simultaneously.

3) Improvements in MS instrumentation, automated sample handling, and ever-evolving database search software are making single-cell proteomics possible, which will help capture glycoproteome perturbations at high resolution.

4) Coupled with other omics technologies (multi-omics), including genomics, transcriptomics, and metabolomics, glycoproteomics will further reveal the functions of glycoproteins.

In conclusion, MS-based proteomics has advanced the study of glycoproteins to an unprecedented degree, from the characterization of glycoproteins to their functional studies.
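
The first limitation, that isomeric glycans cannot be separated by mass alone, can be made concrete with a short calculation. The sketch below computes monoisotopic residue masses from elemental compositions. The residue formulas (monosaccharide minus one water) are standard values, while the dictionary layout and names are illustrative choices.

```python
# Monoisotopic atomic masses in Da.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

def monoisotopic_mass(formula: dict[str, int]) -> float:
    """Sum of monoisotopic atomic masses for an elemental composition."""
    return sum(MONO[element] * count for element, count in formula.items())

# Glycan residue compositions (free monosaccharide minus H2O).
HEXNAC = {"C": 8, "H": 13, "N": 1, "O": 5}
HEXOSE = {"C": 6, "H": 10, "O": 5}

RESIDUES = {
    "GlcNAc": HEXNAC,
    "GalNAc": HEXNAC,  # same composition as GlcNAc, different stereochemistry
    "Glc": HEXOSE,
    "Gal": HEXOSE,
    "Man": HEXOSE,
}

for name, formula in RESIDUES.items():
    print(f"{name:7s} {monoisotopic_mass(formula):.4f} Da")
```

Because the masses within each group are identical, distinguishing these units relies on information other than precursor mass, such as derivatization, diagnostic fragments, or separation prior to MS.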

Declaration of interest

The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

Reviewer disclosures

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Additional information

Funding

This paper was funded by the National Institute of General Medical Sciences of the National Institutes of Health (R01GM118803 and R01GM127711).

References

  • Spiro RG. Protein glycosylation: nature, distribution, enzymatic formation, and disease implications of glycopeptide bonds. Glycobiology. 2002;12(4):43R–56R. doi: 10.1093/glycob/12.4.43R
  • Bieberich E. Synthesis, processing, and function of N-glycans in N-glycoproteins. Adv Neurobiol. 2014;9:47–70.
  • Jensen PH, Kolarich D, Packer NH. Mucin-type O-glycosylation–putting the pieces together. FEBS J. 2010;277(1):81–94. doi: 10.1111/j.1742-4658.2009.07429.x
  • Cheng XG, Cole RN, Zaia J, et al. Alternative O-glycosylation/O-phosphorylation of the murine estrogen receptor beta. Biochemistry. 2000;39(38):11609–11620. doi: 10.1021/bi000755i
  • Xu S, Tong M, Suttapitugsakul S, et al. Spatial and temporal proteomics reveals the distinct distributions and dynamics of O-GlcNAcylated proteins. Cell Rep. 2022;39(11):110946. doi: 10.1016/j.celrep.2022.110946
  • Sun F, Suttapitugsakul S, Wu R. Enzymatic tagging of glycoproteins on the cell surface for their global and site-specific analysis with mass spectrometry. Anal Chem. 2019;91(6):4195–4203. doi: 10.1021/acs.analchem.9b00441
  • Xiao HP, Suttapitugsakul S, Sun FX, et al. Mass spectrometry-based chemical and enzymatic methods for global analysis of protein glycosylation. Acc Chem Res. 2018;51(8):1796–1806. doi: 10.1021/acs.accounts.8b00200
  • Riley NM, Bertozzi CR, Pitteri SJ. A pragmatic guide to enrichment strategies for mass spectrometry-based glycoproteomics. Mol & Cell Proteomics. 2021;20:100029. doi: 10.1074/mcp.R120.002277
  • Suttapitugsakul S, Sun F, Wu R. Recent advances in glycoproteomic analysis by mass spectrometry. Anal Chem. 2020;92(1):267–291. doi: 10.1021/acs.analchem.9b04651
  • Sun FX, Suttapitugsakul S, Wu RH. Systematic characterization of extracellular glycoproteins using mass spectrometry. Mass Spectrom Rev. 2023;42(2):519–545. doi: 10.1002/mas.21708
  • Xiao H, Chen W, Smeekens JM, et al. An enrichment method based on synergistic and reversible covalent interactions for large-scale analysis of glycoproteins. Nat Commun. 2018;9(1):1692. doi: 10.1038/s41467-018-04081-3
  • Yang WM, Ao MH, Hu YW, et al. Mapping the O-glycoproteome using site-specific extraction of O-linked glycopeptides (EXoO). Mol Syst Biol. 2018;14(11):e8486. doi: 10.15252/msb.20188486
  • Xu S, Yin K, Wu R. Combining selective enrichment and a boosting approach to globally and site-specifically characterize protein co-translational O-GlcNAcylation. Anal Chem. 2023;95(9):4371–4380. doi: 10.1021/acs.analchem.2c04779
  • Chen WX, Smeekens JM, Wu RH. Systematic and site-specific analysis of N-sialoglycosylated proteins on the cell surface by integrating click chemistry and MS-based proteomics. Chem Sci. 2015;6(8):4681–4689. doi: 10.1039/C5SC01124H
  • Xiao H, Wu R. Quantitative investigation of human cell surface N-glycoprotein dynamics. Chem Sci. 2017;8(1):268–277. doi: 10.1039/C6SC01814A
  • Suttapitugsakul S, Tong M, Wu RH. Time-resolved and comprehensive analysis of surface glycoproteins reveals distinct responses of monocytes and macrophages to bacterial infection. Angew Chem Int Ed. 2021;60(20):11494–11503. doi: 10.1002/anie.202102692
  • Sun SS, Hu YW, Ao MH, et al. N-GlycositeAtlas: a database resource for mass spectrometry-based human N-linked glycoprotein and glycosylation site mapping. Clin Proteomics. 2019;16(1):35. doi: 10.1186/s12014-019-9254-0
  • Huang J, Wu M, Zhang Y, et al. OGP: a repository of experimentally characterized O-glycoproteins to facilitate studies on O-glycosylation. Genomics Proteomics Bioinformatics. 2021;19(4):611–618. doi: 10.1016/j.gpb.2020.05.003
  • Hornbeck PV, Zhang B, Murray B, et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 2015;43(D1):D512–D520. doi: 10.1093/nar/gku1267
  • Ma JF, Li YX, Hou CY, et al. O-GlcNAcAtlas: a database of experimentally identified O-GlcNAc sites and proteins. Glycobiology. 2021;31(7):719–723. doi: 10.1093/glycob/cwab003
  • Kuster B, Mann M. 18O-labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal Chem. 1999;71(7):1431–1440. doi: 10.1021/ac981012u
  • Suttapitugsakul S, Ulmer LD, Jiang C, et al. Surface glycoproteomic analysis reveals that both unique and differential expression of surface glycoproteins determine the cell type. Anal Chem. 2019;91(10):6934–6942. doi: 10.1021/acs.analchem.9b01447
  • Chen W, Smeekens JM, Wu R. A universal chemical enrichment method for mapping the yeast N-glycoproteome by mass spectrometry (MS). Mol & Cell Proteomics. 2014;13(6):1563–1572. doi: 10.1074/mcp.M113.036251
  • Pearce OMT, Laubli H. Sialic acids in cancer biology and immunity. Glycobiology. 2016;26(2):111–128. doi: 10.1093/glycob/cwv097
  • Riley NM, Malaker SA, Driessen MD, et al. Optimal dissociation methods differ for N- and O-glycopeptides. J Proteome Res. 2020;19(8):3286–3301. doi: 10.1021/acs.jproteome.0c00218
  • Zeng WF, Cao WQ, Liu MQ, et al. Precise, fast and comprehensive analysis of intact glycopeptides and modified glycans with pGlyco3. Nat Methods. 2021;18(12):1515–1523. doi: 10.1038/s41592-021-01306-0
  • Polasky DA, Yu FC, Teo GC, et al. Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nat Methods. 2020;17(11):1125–1132. doi: 10.1038/s41592-020-0967-9
  • Fang P, Ji Y, Silbern I, et al. Evaluation and optimization of high-field asymmetric waveform ion-mobility spectrometry for multiplexed quantitative site-specific N-glycoproteomics. Anal Chem. 2021;93(25):8846–8855. doi: 10.1021/acs.analchem.1c00802
  • Reiding KR, Bondt A, Franc V, et al. The benefits of hybrid fragmentation methods for glycoproteomics. TRAC-Trends Anal Chem. 2018;108:260–268. doi: 10.1016/j.trac.2018.09.007
  • Steentoft C, Vakhrushev SY, Vester-Christensen MB, et al. Mining the O-glycoproteome using zinc-finger nuclease-glycoengineered SimpleCell lines. Nat Methods. 2011;8(11):977–982. doi: 10.1038/nmeth.1731
  • Lu L, Riley NM, Shortreed MR, et al. O-Pair search with MetaMorpheus for O-glycopeptide characterization. Nat Methods. 2020;17(11):1133–1138. doi: 10.1038/s41592-020-00985-5
  • Malaker SA, Pedram K, Ferracane MJ, et al. The mucin-selective protease StcE enables molecular and functional analysis of human cancer-associated mucins. Proc Natl Acad Sci U S A. 2019;116(15):7278–7287. doi: 10.1073/pnas.1813020116
  • Smeekens JM, Xiao H, Wu R. Global analysis of secreted proteins and glycoproteins in Saccharomyces cerevisiae. J Proteome Res. 2017;16(2):1039–1049. doi: 10.1021/acs.jproteome.6b00953
  • Nie S, Lo A, Wu J, et al. Glycoprotein biomarker panel for pancreatic cancer discovered by quantitative proteomics analysis. J Proteome Res. 2014;13(4):1873–1884. doi: 10.1021/pr400967x
