Review Article

CGGBP1—an indispensable protein with ubiquitous cytoprotective functions

Pages 219-232 | Received 25 Jun 2015, Accepted 20 Aug 2015, Published online: 20 Oct 2015

Figures & data

Figure 1. Evolution and structure of CGGBP1. A: A schematic depicting the known and predicted domains and functional sites in human CGGBP1. The SH2 domain, the C2H2 domain, and the nuclear localization signal (NLS) are highlighted. The three tyrosine residues (positions 20, 150, 155) and one serine residue (position 164) whose phosphorylation has been studied for its cellular effects are marked. B: An I-TASSER structure prediction from the CGGBP1 amino acid sequence indicates sequence-based structural similarities with Hermes DNA transposase (2BW3; in the C-terminal half) and with ZNF346 from Xenopus laevis (1ZU1; throughout the peptide sequence). The NLS and the C2H2 DNA-binding domain (DBD) are highlighted. C: The predicted 3-dimensional structure of CGGBP1 from different angles of view. The C- and N-termini are marked as ‘C’ and ‘N’, respectively. The two cysteine and two histidine residues forming the C2H2 Zn finger domain are identifiable through their side chains, which converge at the zinc ion (shown).


Figure 2. Sequence similarity between DNA transposons and CGGBP1 indicates a common origin. A sequence alignment of CGGBP1 against the Charlie group of hAT transposases suggests that CGGBP1 evolved from the DBD of these transposases. Interestingly, the two cysteine and two histidine residues constituting the C2H2 domain are conserved across all the sequences analysed, suggesting an evolutionary pressure to preserve the DBD.


Figure 3. Common sequence features in CGGBP1-binding sites in Alu-SINEs and L1-LINEs. Binding sites of CGGBP1 on Alu and L1 elements have sequence similarity. An alignment of the Alu and L1 DNA sequences at which CGGBP1 binding peaks shows sequence degeneracy such that the transcription factor-binding sites for EGR1 and E2F1 (deduced using JASPAR and Transfac) are conserved (regions in bold, marked with EGR1 and E2F1). An additional region of similarity (brown underlined bold) seems to be conserved and complementarily inverted between L1 and Alu elements. This region (5′-GGAYTACA-3′) is a part of the Alu transcription enhancer region and a major binding site for CGGBP1 (37).
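
As an illustration of what "complementarily inverted" means for the 5′-GGAYTACA-3′ region, the following minimal Python sketch scans a DNA sequence for the motif on both strands, treating Y as the IUPAC code for C or T. The function names and the example sequences are hypothetical and are not taken from the article; this is not the analysis used to produce the figure.

```python
import re

# IUPAC codes needed for this motif, mapped to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}
# Complement of each code (Y = C/T pairs with R = G/A).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "Y": "R", "R": "Y", "N": "N"}


def reverse_complement(motif):
    """Return the reverse complement of an IUPAC motif."""
    return "".join(COMPLEMENT[base] for base in reversed(motif))


def find_motif(seq, motif="GGAYTACA"):
    """Return sorted (position, strand) hits for the motif on both strands.

    Positions are 0-based offsets on the given (forward) sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead is used so that overlapping occurrences are reported.
        regex = re.compile("(?=(" + "".join(IUPAC[b] for b in pattern) + "))")
        hits.extend((m.start(), strand) for m in regex.finditer(seq))
    return sorted(hits)


# Hypothetical example sequences, not real Alu or L1 fragments.
print(reverse_complement("GGAYTACA"))   # TGTARTCC
print(find_motif("ttGGACTACAtt"))       # forward-strand hit
print(find_motif("AATGTAATCCGG"))       # reverse-strand (inverted) hit
```

A hit reported on the "-" strand corresponds to the inverted, complementary orientation of the motif relative to the sequence supplied.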


Figure 4. L1-LINEs function as CGGBP1-dependent cis-regulatory elements for growth-responsive genes. A: A subset of genes undergo expression changes upon growth stimulation (Stimulated) of quiescent cells (Starved) in a CGGBP1-dependent manner. The presence of CGGBP1 at normal levels, or its depletion, can dictate their induction or silencing upon growth stimulation (10% serum used in this case). B: Genes that are suppressed by serum stimulation in the presence of CGGBP1 are rich in L1 content in their 1 kb proximal promoters, unlike genes that are induced by serum stimulation in the presence of CGGBP1. The top left quadrant has 25% of all genes containing L1 elements recognized by RepeatMasker. The bottom right quadrant has only one gene containing L1. The areas of the circles represent the percentage of L1 content in the 1 kb promoter region. The top right and bottom left quadrants, with grey/black data points, represent those genes that are unaffected by CGGBP1 levels and serum stimulation or starvation.


Figure 5. A schematic view of the mechanisms through which CGGBP1 acts as a sensor of growth-induced stress and acts in the nuclei as a cytoprotective agent. The topmost box depicts the signalling pathways in which extranuclear CGGBP1 is demonstrated or indicated to participate. CGGBP1 then localizes to the nuclei and executes a multitude of functions, as depicted in the lower boxes to which the arrows lead. CGGBP1 participates in signal transduction and undergoes Y20 phosphorylation. The EGF- and PDGF-induced phosphorylation of CGGBP1 has been shown in human cells, whereas other interactions with MAGI3, TYRO3, and MAP4K4 are deduced from protein–protein interaction studies in yeast. Anticlockwise from the top, CGGBP1 in nuclei binds to unmethylated DNA and promotes transcription of factors that promote cytosine demethylation. It also binds to transcription-regulatory regions of Alu-SINEs in growing cells, thereby freeing RNA Pol III from unnecessary binding there. On the tRNA genes, CGGBP1 allows transcription under conditions of growth by not binding there. In quiescent cells, CGGBP1 does not bind to tRNA genes or Alu-SINEs. This aids in deployment of RNA Pol III at growth-promoting targets. Nuclear CGGBP1 binds to mitotic chromosomes, and its dysfunction causes telomeric damage resulting in chromosomal fusions and abscission failures. The presence of CGGBP1 at the midbody seems to regulate the abscission checkpoint and prevent tetraploidization. Whether CGGBP1 detects the presence of unsegregated DNA at mitotic bridges is an interesting matter for investigation. The nature of CGG repeats, which are CGGBP1-binding sites, suggests that CGGBP1 dysfunction might result in endogenous DNA damage by allowing CGG repeats to form secondary and tertiary structures, such as G4 quadruplexes, that can halt replication fork progression.


Figure 6. Quantification of differential promoter usage at the CGGBP1 locus in normal and cancer samples (cBioPortal and TCGA databases). The quantitative CAGE data available for each of the different transcript 5′ termini were sorted manually into ‘non-cancer’ and ‘cancer’ groups, and a t test was performed to detect differential promoter usage in non-cancer versus cancer tissues/cells. While p1 is the dominant promoter, p2 shows the most significant cancer-specific induction. The effects of the longer 5′UTR associated with p2-specific transcription on the regulation of the CGGBP1 p2 transcript are currently unknown.
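
The kind of two-group comparison described in this caption can be sketched as follows. The caption does not state which variant of the t test was used; this sketch assumes Welch's unequal-variance form, and the CAGE values are invented purely for illustration. The function name `welch_t` is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t test.

    Assumption: unequal variances between groups; the source does not
    specify the t test variant that was actually used.
    """
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Invented expression values for one promoter; NOT data from the article.
non_cancer = [1.0, 2.0, 3.0, 4.0, 5.0]
cancer = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof = welch_t(non_cancer, cancer)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Converting the statistic to a p value requires the t distribution; with SciPy installed, `scipy.stats.ttest_ind(non_cancer, cancer, equal_var=False)` performs the same test and returns the p value directly.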


Figure 7. Direct and indirect gene expression regulation by CGGBP1 is directed at specific functional categories that are consistent with the functions of CGGBP1 known so far. The TCGA and cBioPortal databases were mined to identify genes that exhibit significant positive or inverse correlation with CGGBP1 expression in various cancers. These genes, defined as CGGBP1-co-varying genes, belong to specific functional categories that overlap with the functions in which CGGBP1 has been shown to participate or has been implicated based on preliminary findings. The co-variance of these genes with CGGBP1 thus suggests that CGGBP1 acts as a common (direct or indirect) underlying regulator of their expression in cancer.


Table I. A list of proteins that interact with CGGBP1. The unpublished findings (Singh and Westermark) are based on a yeast two-hybrid screen using full-length human CGGBP1 as bait against a human brain cDNA library.