Research Paper

Structure and RNA-binding properties of the Type III-A CRISPR-associated protein Csm3

Pages 1670-1678 | Received 15 Aug 2013, Accepted 16 Sep 2013, Published online: 30 Sep 2013

Abstract

The prokaryotic adaptive immune system is based on the incorporation of genome fragments of invading viral genetic elements into clustered regularly interspaced short palindromic repeats (CRISPRs). The CRISPR loci are transcribed and processed into crRNAs, which are then used to target the invading nucleic acid for degradation. The large family of CRISPR-associated (Cas) proteins mediates this interference response. We have characterized Methanopyrus kandleri Csm3, a protein of the type III-A CRISPR-Cas complex. The 2.4 Å resolution crystal structure shows an elaborate four-domain fold organized around a core RRM-like domain. The overall architecture highlights the structural homology to Cas7, the Cas protein that forms the backbone of type I interference complexes. Csm3 binds unstructured RNAs in a sequence non-specific manner, suggesting that it interacts with the variable spacer sequence of the crRNA. The structural and biochemical data provide insights into the similarities and differences in this group of Cas proteins.

Introduction

For a long time, prokaryotic immune systems were believed to be restricted to “innate” immunity mechanisms (e.g., restriction modification systems)Citation1 and to defense mechanisms that result in cell death upon infection (e.g., toxin-antitoxin systems).Citation2 In the past decade, however, it has become clear that prokaryotes have evolved sophisticated and diverse adaptive immune systems that memorize previous attacks of foreign genetic elements. These systems consist of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) proteins.Citation3-Citation5 CRISPR-Cas is a nucleic acid-based defense system against mobile genetic elements such as viruses.Citation5 The CRISPR-Cas machinery distinguishes foreign (non-self) target DNA from (self) targets that are, for example, provided by a host CRISPR locus.Citation6,Citation7

The central element of CRISPR arrays is the arrangement of DNA sequences of variable length (spacers) derived from foreign genetic elements and separated by short 24–48 nt repeat sequences.Citation5 Upon infection, these clusters are transcribed into precursor crRNAs (pre-crRNA), which then are processed into mature CRISPR RNAs (crRNA).Citation8-Citation10 The common features of mature crRNAs are the spacer, which identifies the matching target (protospacer) via base pairing, and the 5′-terminal 8 nt repeat tag (psi-tag), which is complementary to the self DNA but not to 2–4 nt short protospacer adjacent motif (PAM) sequences.Citation11 Adjacent to this array are the cas genes.Citation12,Citation13 These encode proteins that are responsible for mediating the CRISPR response and that have a variety of functions, including nucleic acid binding and cleavage.Citation14

CRISPR-Cas systems have been classified into three main types (I, II, and III) and 10 subtypes by bioinformatic analyses based on their cas gene organization, on the sequence and the structure (known or predicted) of the corresponding proteins.Citation15 The three CRISPR types also differ in the composition and mechanisms of their effector complexes.Citation16 Type I effector complexes are termed Cascade (CRISPR-associated complex for antiviral defense), type II effector complexes consist of a single Cas protein and two RNA molecules, and type III interference complexes are further divided into type III-A (Csm complex targeting DNA) and type III-B (Cmr complex targeting RNA).Citation11,Citation17 In recent years, structural information on Cas proteins has started to provide insights into the molecular mechanisms of crRNA binding and target recognition. The combination of X-ray crystallographyCitation8,Citation18-Citation22 and electron microscopic studies of the type I CascadeCitation23-Citation25 and of the Type III-B Cmr-complexCitation26-Citation28 has shown how some of the Cas proteins interact and bind crRNA.

Type I effector complexes are built around a central backbone composed of proteins of the Cas7 family.Citation23,Citation29 The crystal structure of a Cas7 type I protein has revealed the presence of a central RRM/ferredoxin-like domain with several insertions and a C-terminal extension.Citation29 In type I systems, Cas7 oligomerizes upon crRNA binding. In the best-characterized effector complex so far, the Escherichia coli Cascade complex, the crRNA binds within a super-helical groove formed by six copies of Cas7.Citation23,Citation30 This helical arrangement has also been observed within other type I systems.Citation29,Citation31,Citation32 Despite the absence of significant sequence similarity, bioinformatic analysis has predicted that Cas7-like proteins also exist in type III systems.Citation33 Recently, it was shown that a Csm3 (CRISPR-Cas Subtype Mtube, protein 3) from Staphylococcus epidermidis binds RNA molecules at multiple sites.Citation34 Here, we present the crystal structure and RNA-binding properties of Methanopyrus kandleri Csm3. The structural and biochemical analysis of this type III-A Cas protein indicates that Csm3 is a Cas7-like protein capable of binding crRNA, suggesting it forms the backbone of the CRISPR-Cas Type III-A system effector complex.

Results and Discussion

Structure determination of Csm3

We expressed full-length Methanopyrus kandleri (Mk) Csm3 (351 residues) in E. coli and purified it to homogeneity (Fig. S1A). Mk Csm3 yielded crystals in an orthorhombic space group (C222) containing two molecules per asymmetric unit and diffracting beyond 2.4 Å resolution. An X-ray fluorescence scan on the crystals showed an unexpected peak at the zinc excitation energy, suggesting the presence of intrinsically bound zinc ions in the crystallized protein. We exploited the presence of this anomalous scatterer to solve the structure by the single-wavelength anomalous dispersion (SAD) method. The phases (obtained from a single bound zinc ion) were of sufficient quality to build the polypeptide chain. The structure was refined at 2.37 Å resolution to an Rfree of 21.0%/Rwork of 18.0% and good stereochemistry (Table 1). The final model includes most of the protein, with the exception of a disordered region between residues 200 and 214. The two independent molecules in the asymmetric unit are very similar, superposing with a root mean square deviation (rmsd) of 0.22 Å for more than 95% of the Cα atoms. Static light scattering experiments of Csm3 in solution showed a mass of 33.3 kDa (Fig. S1B), consistent with the presence of a monomeric species. Thus, the interaction of the two molecules in the asymmetric unit reflects crystal packing contacts and not a physiological oligomer.
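As a rough plausibility check on the monomer assignment, the measured mass can be compared with the mass expected from the chain length. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from this study; the function names are illustrative only.

```python
# Rule-of-thumb average mass per amino-acid residue (assumption, not from the paper).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimated_mass_kda(n_residues, copies=1):
    """Approximate mass in kDa of an oligomer built from `copies` identical chains."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA * copies / 1000.0


def closest_oligomeric_state(measured_kda, n_residues, max_copies=4):
    """Return the copy number whose estimated mass is closest to the measured mass."""
    return min(
        range(1, max_copies + 1),
        key=lambda copies: abs(estimated_mass_kda(n_residues, copies) - measured_kda),
    )


if __name__ == "__main__":
    # 351 residues and 33.3 kDa are the values reported in the text.
    print(closest_oligomeric_state(33.3, 351))
```

Under this assumption the estimated monomer mass is far closer to the measured 33.3 kDa than the estimated dimer mass is, in line with the conclusion drawn from the light scattering data.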

Table 1. Data collection and structure refinement statistics of Csm3

Csm3 is built of four domains organized around a central RRM-like fold

The crystal structure of Mk Csm3 reveals a compact architecture that can be described as composed of four domains: the core, the lid, the helical, and the C-terminal domains (Fig. 1A, in green, blue, red, and yellow, respectively). The core domain has a β1-α1-β2-β3-α2-β4 arrangement of secondary structure elements (Fig. 1B) with a topology typical of RRM-like and ferredoxin-like folds. Accordingly, the Mk Csm3 core domain folds into an antiparallel β-sheet, with two α-helices packed against the concave (back) surface. However, several features set the Mk Csm3 core domain apart from canonical RRM-like folds. In the β-sheet, strand β1 is long and highly bent, with a glycine residue (Gly12) at the bending point effectively dividing it into two separate structural elements (strands β1A and β1B, Fig. 1B). Strands β3 and β4, which sandwich β1, are also elongated (~12 residues), while strand β2 is very short (three residues).

Figure 1. Structure of Methanopyrus kandleri Csm3. (A) The structure of Mk Csm3 can be divided into four distinct elements: the core (green) and lid domain (blue), a helical N-terminal (red), and a C-terminal domain (yellow). The structural elements of the core adopt a ferredoxin-like fold with β-α-β-β-α-β arrangement. The core is topologically interrupted by multiple insertions forming the lid and the helical N-terminal domain. The C-terminal domain packs against the core and is of mixed structural composition. The dashed blue line represents the missing disordered region between residues 200 and 214. The two views are related by a 180° rotation as indicated. (B) Topology diagram of Mk Csm3. Helices are represented as circles and β-strands as arrows. The secondary structure elements have been labeled numerically maintaining the nomenclature of RRM domains. The β-strands of the C-terminal domain extending the RRM β-sheet have also been labeled numerically. The additional α-helices have been labeled with letters (αA to αL). (C) A structural zinc ion present in the helical N-terminal domain is shown as a gray sphere, together with the coordinating residues (a cysteine and three histidine residues).


The secondary structure elements of the core are connected by loop regions ranging from 2 to 10 amino-acid residues (between β3-α2 and between α2-β4, respectively) or by larger insertions (between β1-α1, β2-β3, and α1-β2) (Fig. 1B). The 35-residue long β1-α1 insertion contains a short β-hairpin and a one-turn α-helix (αA). On one side, it packs against the 45-residue long β2-β3 insertion, which also contains an α-helix (αG). On the other side, it packs against the α2-β4 loop. Overall, these interactions form the lid domain, which is positioned at the top of the β-sheet and is partially disordered (at a glycine-containing loop in the β2-β3 insertion).

The 100-residue long α1-β2 insertion contains five short α-helices (αB to αF) connected by extended segments (Fig. 1B). This insertion forms the α-helical domain and wedges between the two helices (α1 and α2) of the core domain, near the short edge of the β-sheet (i.e., near β2). The helical domain binds a zinc ion that is buried and is likely to have a structural role in stabilizing the fold of this domain (Fig. 1C). It connects helices (αD to αE) and is coordinated by His86, Cys88, Cys115, and Cys118. Only the latter two residues are well conserved among Csm3 orthologs (Fig. S4A). However, other cysteine and histidine residues are present in the α1-β2 insertion of Csm3 from other species (Fig. S4A). It is thus possible that other Csm3 proteins might have a zinc-binding domain in the corresponding region of the structure, albeit with a different topology.

Finally, the RRM-like domain is followed by a C-terminal domain (αH-β6-αI-αJ-β5-αK-αL) (Fig. 1B). The C-terminal domain extends the core β-sheet by two antiparallel β-strands (β5 and β6), which flank the long edge of the core domain β-sheet (at β4). It also contains three short α-helices and a long C-terminal α-helix. The C-terminal α-helix packs against α2, at the convex surface of the β-sheet. The short helices (αI and αJ) contact the lid and partly occlude the front surface of the β-sheet. In canonical RRM domains, this front surface features hydrophobic residues that are part of the so-called RNP1 and RNP2 motifs and that bind RNA.Citation35 However, Mk Csm3 lacks the typical solvent-exposed hydrophobic residues that bind RNA in canonical RRM domains. Thus, the Mk Csm3 RRM seems to fulfill a structural purpose similar to other previously reported examples.Citation36,Citation37

Structural comparison of Csm3 with the Cas proteins of the RAMP superfamily

We compared the structure of Mk Csm3 with those of Cas5, Cas6, and Cas7, which represent the three major groups of evolutionary distinct RRM-containing proteins in the RAMP (repeat-associated mysterious protein) superfamilyCitation15 (Fig. S2A). Bacillus halodurans (Bh) Cas5d (PDB ID: 4F3M) has two RRM-like domains adjacent to each other and functions as an endoribonuclease in the pre-processing of crRNA transcripts. The N-terminal RRM-like domain contains the putative endoribonuclease site, which is centered at a histidine residue.Citation18,Citation31 Pyrococcus furiosus (Pf) Cas6 (PDB ID: 3UFC) has an N-terminal RRM-like domain that packs against a twisted β-sheet domain.Citation21 This RRM-like domain contains an endoribonuclease site that is also centered at a histidine residue, although the exact position differs from that of Bh Cas5d. The similarity of Mk Csm3 with Bh Cas5d and Pf Cas6 is limited to the RRM-like domain (Fig. S2A). Using the structural alignment program SSM as implemented in Coot,Citation38 Mk Csm3 superposed with Bh Cas5d with an rmsd of 3.9 Å over 80 Cα atoms and with Pf Cas6 with an rmsd of 4.1 Å over 125 Cα atoms. No prominent histidine residue or possible catalytic triad is however apparent from these structural alignments of Csm3. Consistently, Mk Csm3 did not exhibit any prominent endonucleolytic activity with repeat RNA or precursor RNA substrates (data not shown).
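The rmsd values quoted above summarize the distances between equivalent Cα atoms after superposition. The following is a minimal sketch of the rmsd formula only; it assumes the atoms have already been paired and superposed (the SSM alignment step is not reproduced here), and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired 3D points, in the input units.

    Both inputs are equal-length lists of (x, y, z) tuples that have already
    been superposed; no fitting or atom pairing is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))
```

Because the sum is divided by the number of aligned atoms, an rmsd is only meaningful together with that number, which is why the text reports both (for example, 3.9 Å over 80 Cα atoms).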

Bioinformatic analyses have predicted that Csm3 belongs to the Cas7 family of RAMP proteins.Citation39 Superposition of Mk Csm3 with the Sulfolobus solfataricus (Sso) Cas7 (PDB ID: 3PS0) structure results in an rmsd of 4.2 Å over 110 Cα atoms. As with Cas5 and Cas6, the structural similarity with Cas7 is primarily at the RRM-like domain. However, Mk Csm3 shares significant overall architectural analogy with Cas7 (Fig. 2). In particular, the two proteins have a similar arrangement of domains around the RRM-like fold. Cas7 contains a lid domain, a (mostly) helical domain, and a C-terminal domain at structural positions equivalent to those described above for Csm3 (Fig. 2). Although Cas7 is not a zinc-binding protein and although the exact topological arrangement of secondary structure elements differs from Mk Csm3, the overall dimensions and shape of the two proteins are remarkably similar (Fig. 2). As Cas7 is a scaffold RNA-binding protein, we assessed whether Mk Csm3 might have similar RNA-binding properties.

Figure 2. Structural similarity between Csm3 and Cas7. (A) Sso Cas7 (PDB ID: 3PS0, rmsd: 4.2Å, blue) shares the highest structural homology with Mk Csm3 (gold) beyond the core domain (gray). Both proteins have a similar arrangement of auxiliary domains surrounding the RRM-like fold, as well as a conserved architecture of the C-terminal domain. (B) Topology diagram of Mk Csm3 and Sso Cas7 showing the connectivity of the RRM fold relative to the other domains. The topological arrangement of the insertions is similar in both proteins. Similarities in secondary structure elements are highest within the core and low in the auxiliary domains.


Csm3 binds single-stranded RNAs in a sequence non-specific manner

We performed electrophoretic mobility shift assays (EMSA) with crRNA substrates that were generated by in vitro transcription. These assays indicated that Mk Csm3 binds crRNAs (Fig. 3A). The Mk crRNAs contain a highly conserved repeat sequence of 36 nucleotides that includes a predicted stable stem-loop of 16–18 nucleotides (Fig. 3C) and a highly conserved eight nucleotide AATGAAA(C/G) motif at the 5′ end (psi-tag). They also contain variable spacer sequences ranging from 40–50 nucleotides.Citation40 We dissected which parts of the crRNA are recognized by Mk Csm3. In gel-shift assays, Csm3 showed weak binding to processed and unprocessed repeat sequences (Fig. 3B), but not to its stem-loop structure alone (Fig. S3A). We tested whether Mk Csm3 binds single-stranded RNA, which is present in part of the repeat sequence as well as in the variable spacer. In gel-shift assays, Csm3 bound a 15-mer polyU RNA or 15-mer polyA about 10 times more strongly than the repeat sequence (Fig. 3B). Thus, the length of the single-stranded RNA might affect the strength of the interaction with Mk Csm3. Mk Csm3 did not exhibit detectable RNA binding toward the 8 nt psi-tag in the gel-shift assays (Fig. S3A). We conclude that Mk Csm3 binds single-stranded RNAs from 15 nucleotides onwards in an apparently sequence non-specific manner and that RNA structures impair binding. This suggests that the variable sequence of the crRNA is bound by Mk Csm3, rather than the structured and conserved repeat.
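The eight-nucleotide psi-tag motif described above, AATGAAA(C/G), can be written as a simple pattern. The sketch below is an illustration only: it assumes the tag sits at the very start (5′ end) of the supplied sequence and accepts U in place of T so that RNA-style input also matches. The function name and any sequences passed to it are hypothetical, not taken from this study.

```python
import re

# 8 nt motif from the text, AATGAAA(C/G); U is accepted in place of T for RNA input.
PSI_TAG = re.compile(r"AA[TU]GAAA[CG]")


def has_five_prime_psi_tag(sequence):
    """Return True if the sequence starts (5' end) with the AATGAAA(C/G) motif."""
    # re.match anchors at the start of the string, i.e. the 5' end as written.
    return PSI_TAG.match(sequence.upper()) is not None
```

A sequence that carries the motif only internally is rejected, because the match is anchored to the first nucleotide.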

Figure 3. RNA-binding properties of Csm3. (A) Mk Csm3 binds to a physiological crRNA substrate (left panel). 32P-labeled crRNA transcripts were incubated in the absence or presence of 5 µM, 10 µM, and 20 µM Mk Csm3. (B) Electrophoretic mobility shift assays were performed with the respective [32P]-5′-end labeled RNAs and increasing concentrations of Mk Csm3 (0 µM, 1 µM, 30 µM, 100 µM). Mk Csm3 binds to single-stranded RNA substrates (lanes 16–20) but not significantly to the repeat sequences (lanes 1–5 and 6–10). Binding to single-stranded RNA is dependent on length but not sequence (compare lanes 16–20 and 11–15). Weak binding of Mk Csm3 to processed and unprocessed repeat sequences (lanes 1–5 and 6–10, respectively) is likely attributable to the ssRNA overhangs. (C) Methanopyrus kandleri repeat sequence conservation and predicted RNA folding.


To identify the RNA-binding interface, we examined the surface features of Mk Csm3 in terms of charge distribution (Fig. 4A) and evolutionary conservation (Fig. 4B). The lid domain contains a striking patch (1) of conserved and surface-exposed positively charged residues including Arg217, Arg263, and Arg267 (Fig. S4A). Another positively charged residue, Arg21, is located at the center of this patch and approaches the position of Sso Cas7 His160, a residue that has been shown to be important for RNA binding.Citation29 A single mutation of Mk Csm3 Arg21 to Ala abolished RNA binding in EMSAs (Fig. 4D). In the lid domain, the positively charged surface patch is near the disordered glycine-containing loop. This loop is conserved but does not appear to be involved in RNA binding, as its deletion did not show a significant change in the EMSA as compared with the wild-type (WT) protein (Fig. 4D). Another striking surface patch (2) of Mk Csm3 is located at the interface between the lid domain and the helical domain. In particular, helix α1 exposes several conserved residues, including Pro50, Ser52, Ser53, and Arg57 (Fig. S4B). Mutation of this conserved surface patch (2), however, did not significantly impair RNA binding (Fig. S4C). We concluded that Csm3 uses the lid domain to bind single-stranded RNA. It is possible to envisage that the other conserved surface patches on Csm3 mediate other types of macromolecular interactions, including protein–protein interactions that form in Csm3-containing effector complexes.


Figure 4. Identification of Csm3 RNA-binding residues. (A) The structure of Csm3 is shown in surface representations, in the same orientations as in Figure 1A, colored according to electrostatic potential. Positively charged patches (blue) are present at the back of the lid domain as well as at the interface between the core and N-terminal helical domain. Negatively charged surfaces (red) are located along the front of the N-terminal insertion and cover the C-terminal domain. Two surface patches discussed in the text (patch 1 and 2) are indicated. (B) Corresponding surface representations of Csm3 colored according to conservation within the Csm3 family. The conservation is based on a comprehensive alignment (Fig. S4B). Increase in conservation is shown in increasingly darker shades (from white to red). No or low conservation (white and yellow) is found in the N-terminal insertion and the C-terminal domain. Highly conserved residues (orange and red) are located within the lid (patch 1) and core domains (patch 2) and coincide with positively charged surfaces (A). (C) Sequence alignments of Csm3 orthologs in regions corresponding to surface patches 1 and 2 (A and B). Residues selected for mutation analysis are highlighted with red dots. The unstructured loop (H199-S214) replaced by a (GS)3 linker is represented as a dashed red line. (D) RNA binding of Csm3 mutants to a single-stranded RNA substrate U15. Wild-type (WT) protein and the double mutant within the core domain (patch 2) bind with comparable affinity. Replacement of the unstructured loop (H199-S214) by a -(GS)3- linker does not impair binding, while the single mutant R21A has completely lost RNA-binding ability under these conditions. (E) Coomassie-stained 12% SDS-PAGE gel of the purified protein samples used in the assays.

Conclusions

CASCADE/Cmr/Csm complexes share common functionalities, as reflected in their similar protein composition. Proteins of the Cas7 family are the backbone of the Type I effector complexes and are involved in interactions with both crRNA and other Cas proteins.Citation19,Citation29,Citation32 Computational analyses predicted that Csm3 might fulfill the role of the backbone protein Cas7 in type III interference assemblies.Citation39 Here, we show that Mk Csm3 indeed has an architecture remarkably similar to that of Sso Cas7. We found that the structural similarity involves not only the central RRM-like domain, but also insertions at equivalent structural positions in the RRM fold. At the sequence level, however, the two proteins have almost completely diverged.

In line with the structural similarity to Cas7, Mk Csm3 recognizes crRNA. We found that Csm3 binds to a variable sequence of ssRNA via the flexible insertion that forms a lid on top of the RRM domain. The overall affinity toward RNA is significant yet not strong. It is abolished through mutation of an arginine residue (Arg21Ala) yet hardly reduced when mutating other conserved residues within the positively charged surfaces. It is possible, however, that this region contributes to RNA binding in the context of a fully assembled Csm complex. Type III systems further process premature crRNA to mature crRNA; Csm3, together with Csm2 and Csm5, was reported to be required for crRNA 3′ termini maturation.Citation41 However, in our studies, we could neither identify potential catalytic residues nor observe nucleolytic activity in biochemical assays. This is in agreement with S. epidermidis Csm3 studies indicating that crRNA maturation cleavage events are not performed by the Cas10/Csm complex.Citation34

Cas7 proteins oligomerize with a helical arrangement around the crRNA and interact with other Cas proteins of the effector complex, such as Cas5 in Type I-A.Citation42 The S. solfataricus Cas7 protein was shown to be monomeric and is thought to require Cas5 and crRNA for nucleation and stabilization of its assembly.Citation29 In agreement, Mk Csm3 also behaves as a monomer in solution and might only oligomerize in the context of the Csm complex. It is possible that the insertion domains that surround the RRM and/or the RRM itself might provide interfaces for protein–protein interactions.Citation35 We note, however, that in contrast to observations with bacterial S. epidermidis Csm3, we did not observe binding of RNA molecules in six-nucleotide increments for Mk Csm3.Citation34 Our structural observations provide a first step toward the structural elucidation of the Csm proteins and their respective roles in the surveillance complex. Additionally, the structure will contribute to characterizing the evolutionary relationship within the Cas7 protein family. Further tentative type III members of this family (Cmr1, Cmr4, Cmr6, Csm5)Citation39 remain to be analyzed and classified.

Experimental Procedures

Protein expression and purification

Mk Csm3 wild-type and mutant proteins were expressed as recombinant His- and His-SUMO-tagged fusion proteins using BL21-Gold (DE3) Star pRARE (Stratagene) in TB medium and induced overnight at 18 °C. The cells were lysed by sonication in buffer A (50 mM Tris pH 7.5, 200 mM NaCl, 10% glycerol) supplemented with 10 mM imidazole, DNase, and protease inhibitors (Roche). Proteins (wild-type and mutants) were purified using nickel-based affinity chromatography. The His-SUMO tag was cleaved by adding SUMO protease overnight. Proteins were further purified by size-exclusion chromatography (Superdex 75, GE Healthcare) in gel-filtration buffer (buffer A supplemented with 2 mM DTT). Point mutations were introduced by QuikChange site-directed mutagenesis according to the manufacturer’s instructions (Stratagene).

Crystallization, data collection, structure determination, and analysis

Crystallization was performed at room temperature using the hanging-drop vapor diffusion method and equal volumes of the protein at 20 mg/ml (in gel-filtration buffer) and of crystallization buffer (25% MPD and 50 mM MES pH 6.0). Crystals were either flash-frozen directly from the crystallization drop or subjected to further dehydration (increasing amounts of MPD up to 60%) and diffracted beyond 2.4 Å.

All diffraction data were collected at 100 K at the beamline PXII of the Swiss Light Source (SLS) synchrotron and processed using XDS.Citation43 The structures were determined using the native data and Zn-SAD phases to build an initial model. This was then used as a search model for molecular replacement of higher resolution data using Phaser.Citation44 Model building was performed manually with the program CootCitation38 and refined with PHENIX.Citation45 The data collection and refinement statistics are summarized in Table 1. Figures were prepared using PyMOL (http://www.pymol.org).

Biochemical assays

The RNA molecules U15, A10, A15, A20, A40 were synthesized (Purimex). The crRNA (locus 5, spacer 5) was produced by in vitro run-off transcription and purified by elution of the crRNA transcript from a polyacrylamide gel as described.Citation10 The RNA molecules were 5′-labeled with T4 polynucleotide kinase (New England Biolabs) and γ-[32P] ATP (PerkinElmer).

For the gel-shift assays, 0.5 pmol labeled RNA was mixed with 1 µM, 10 µM, 30 µM, or 100 µM protein in a 10 µL reaction containing 20 mM Hepes at pH 7.5, 100 mM KOAc, 4 mM Mg(OAc)2, 0.1% (vol/vol) NP-40, and 2 mM DTT. Fifteen ng/µL (500 fmol/µL = 500x molar excess) yeast tRNA mix (Amicon) was used as non-specific competitor and 15 ng/µL unlabeled crRNA transcripts were used as specific competitor molecules. The mixtures were incubated for 20 min at 55 °C before adding 2 µL 50% (vol/vol) glycerol containing 0.25% (wt/vol) xylene cyanol. Samples were run on an 8% (wt/vol) polyacrylamide gel at 4 °C and visualized by phospho-imaging (GE Healthcare).
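The protein excess in these binding reactions follows from simple unit conversions. The sketch below is illustrative only (the function names are not from the study): it converts the stated 0.5 pmol of labeled RNA in a 10 µL reaction to a molar concentration and relates it to the protein concentrations used in the titration.

```python
def molar_concentration_nm(amount_pmol, volume_ul):
    """Concentration in nM for an amount in pmol dissolved in a volume in µL."""
    # 1 pmol/µL equals 1 µM, i.e. 1000 nM.
    return amount_pmol * 1000.0 / volume_ul


def protein_to_rna_ratio(protein_um, rna_nm):
    """Molar ratio of protein (given in µM) to RNA (given in nM)."""
    return protein_um * 1000.0 / rna_nm


if __name__ == "__main__":
    rna_nm = molar_concentration_nm(0.5, 10.0)  # amounts stated in the text
    for protein_um in (1.0, 10.0, 30.0, 100.0):  # titration points from the text
        print(protein_um, protein_to_rna_ratio(protein_um, rna_nm))
```

The labeled RNA is thus present at 50 nM, so even the lowest protein concentration (1 µM) corresponds to a 20-fold molar excess of protein over labeled RNA.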


Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Acknowledgments

The authors would like to thank Sutapa Chakrabarti and Marco Hein for critical reading of the manuscript and Jérôme Basquin, Karina Valer-Saldaña, and Sabine Pleyer at the MPI-Martinsried Crystallization Facility. The authors thank the staff of the PXII beamline at the Swiss Light Source for assistance during data collection. This study was supported by the Max Planck Gesellschaft, the DFG Research Group 1680 (FOR1680) to EC and LR, CIPSM to EC, and the Schering Foundation fellowship to AH. Author contributions: AH, AS, and LR designed the experiments; AS and JE crystallized the WT protein; AH and CB solved the structure; AS performed the experiment in ; AH performed all other experiments. AH, EC, and LR wrote the manuscript.

10.4161/rna.26500

Accession Number

The coordinates and the structure factors have been deposited to the Protein Data Bank with the accession code: 4N0L

Supplemental Materials

Supplemental materials may be found here: www.landesbioscience.com/journals/rnabiology/article/26500

References

  • Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nat Rev Microbiol 2010; 8:317 - 27; http://dx.doi.org/10.1038/nrmicro2315; PMID: 20348932
  • Makarova KS, Wolf YI, Koonin EV. Comparative genomics of defense systems in archaea and bacteria. Nucleic Acids Res 2013; 41:4360 - 77; http://dx.doi.org/10.1093/nar/gkt157; PMID: 23470997
  • Mojica FJ, Díez-Villaseñor C, García-Martínez J, Soria E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J Mol Evol 2005; 60:174 - 82; http://dx.doi.org/10.1007/s00239-004-0046-3; PMID: 15791728
  • Mojica FJ, Díez-Villaseñor C, Soria E, Juez G. Biological significance of a family of regularly spaced repeats in the genomes of Archaea, Bacteria and mitochondria. Mol Microbiol 2000; 36:244 - 6; http://dx.doi.org/10.1046/j.1365-2958.2000.01838.x; PMID: 10760181
  • Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, Romero DA, Horvath P. CRISPR provides acquired resistance against viruses in prokaryotes. Science 2007; 315:1709 - 12; http://dx.doi.org/10.1126/science.1138140; PMID: 17379808
  • Marraffini LA, Sontheimer EJ. CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nat Rev Genet 2010; 11:181 - 90; http://dx.doi.org/10.1038/nrg2749; PMID: 20125085
  • Wiedenheft B, Sternberg SH, Doudna JA. RNA-guided genetic silencing systems in bacteria and archaea. Nature 2012; 482:331 - 8; http://dx.doi.org/10.1038/nature10886; PMID: 22337052
  • Carte J, Wang R, Li H, Terns RM, Terns MP. Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes. Genes Dev 2008; 22:3489 - 96; http://dx.doi.org/10.1101/gad.1742908; PMID: 19141480
  • Haurwitz RE, Jinek M, Wiedenheft B, Zhou K, Doudna JA. Sequence- and structure-specific RNA processing by a CRISPR endonuclease. Science 2010; 329:1355 - 8; http://dx.doi.org/10.1126/science.1192272; PMID: 20829488
  • Richter H, Zoephel J, Schermuly J, Maticzka D, Backofen R, Randau L. Characterization of CRISPR RNA processing in Clostridium thermocellum and Methanococcus maripaludis. Nucleic Acids Res 2012; 40:9887 - 96; http://dx.doi.org/10.1093/nar/gks737; PMID: 22879377
  • Marraffini LA, Sontheimer EJ. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 2008; 322:1843 - 5; http://dx.doi.org/10.1126/science.1165771; PMID: 19095942
  • Grissa I, Vergnaud G, Pourcel C. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res 2007; 35:Web Server issue W52-7; http://dx.doi.org/10.1093/nar/gkm360; PMID: 17537822
  • Jansen R, Embden JD, Gaastra W, Schouls LM. Identification of genes that are associated with DNA repeats in prokaryotes. Mol Microbiol 2002; 43:1565 - 75; http://dx.doi.org/10.1046/j.1365-2958.2002.02839.x; PMID: 11952905
  • Makarova KS, Grishin NV, Shabalina SA, Wolf YI, Koonin EV. A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biol Direct 2006; 1:7; http://dx.doi.org/10.1186/1745-6150-1-7; PMID: 16545108
  • Makarova KS, Haft DH, Barrangou R, Brouns SJ, Charpentier E, Horvath P, Moineau S, Mojica FJ, Wolf YI, Yakunin AF, et al. Evolution and classification of the CRISPR-Cas systems. Nat Rev Microbiol 2011; 9:467 - 77; http://dx.doi.org/10.1038/nrmicro2577; PMID: 21552286
  • Bhaya D, Davison M, Barrangou R. CRISPR-Cas systems in bacteria and archaea: versatile small RNAs for adaptive defense and regulation. Annu Rev Genet 2011; 45:273 - 97; http://dx.doi.org/10.1146/annurev-genet-110410-132430; PMID: 22060043
  • Hale CR, Zhao P, Olson S, Duff MO, Graveley BR, Wells L, Terns RM, Terns MP. RNA-guided RNA cleavage by a CRISPR RNA-Cas protein complex. Cell 2009; 139:945 - 56; http://dx.doi.org/10.1016/j.cell.2009.07.040; PMID: 19945378
  • Garside EL, Schellenberg MJ, Gesner EM, Bonanno JB, Sauder JM, Burley SK, Almo SC, Mehta G, MacMillan AM. Cas5d processes pre-crRNA and is a member of a larger family of CRISPR RNA endonucleases. RNA 2012; 18:2020 - 8; http://dx.doi.org/10.1261/rna.033100.112; PMID: 23006625
  • Nam KH, Haitjema C, Liu X, Ding F, Wang H, DeLisa MP, Ke A. Cas5d protein processes pre-crRNA and assembles into a cascade-like interference complex in subtype I-C/Dvulg CRISPR-Cas system. Structure 2012; 20:1574 - 84; http://dx.doi.org/10.1016/j.str.2012.06.016; PMID: 22841292
  • Mulepati S, Bailey S. Structural and biochemical analysis of nuclease domain of clustered regularly interspaced short palindromic repeat (CRISPR)-associated protein 3 (Cas3). J Biol Chem 2011; 286:31896 - 903; http://dx.doi.org/10.1074/jbc.M111.270017; PMID: 21775431
  • Wang R, Preamplume G, Terns MP, Terns RM, Li H. Interaction of the Cas6 riboendonuclease with CRISPR RNAs: recognition and cleavage. Structure 2011; 19:257 - 64; http://dx.doi.org/10.1016/j.str.2010.11.014; PMID: 21300293
  • Reeks J, Naismith JH, White MF. CRISPR interference: a structural perspective. Biochem J 2013; 453:155 - 66; http://dx.doi.org/10.1042/BJ20130316; PMID: 23805973
  • Wiedenheft B, Lander GC, Zhou K, Jore MM, Brouns SJ, van der Oost J, Doudna JA, Nogales E. Structures of the RNA-guided surveillance complex from a bacterial immune system. Nature 2011; 477:486 - 9; http://dx.doi.org/10.1038/nature10402; PMID: 21938068
  • Wiedenheft B, Zhou K, Jinek M, Coyle SM, Ma W, Doudna JA. Structural basis for DNase activity of a conserved protein implicated in CRISPR-mediated genome defense. Structure 2009; 17:904 - 12; http://dx.doi.org/10.1016/j.str.2009.03.019; PMID: 19523907
  • Zhu X, Ye K. Crystal structure of Cmr2 suggests a nucleotide cyclase-related enzyme in type III CRISPR-Cas systems. FEBS Lett 2012; 586:939 - 45; http://dx.doi.org/10.1016/j.febslet.2012.02.036; PMID: 22449983
  • Osawa T, Inanaga H, Numata T. Crystal Structure of the Cmr2-Cmr3 Subcomplex in the CRISPR-Cas RNA Silencing Effector Complex. J Mol Biol 2013; Forthcoming http://dx.doi.org/10.1016/j.jmb.2013.03.042; PMID: 23583914
  • Cocozaki AI, Ramia NF, Shao Y, Hale CR, Terns RM, Terns MP, Li H. Structure of the Cmr2 subunit of the CRISPR-Cas RNA silencing complex. Structure 2012; 20:545 - 53; http://dx.doi.org/10.1016/j.str.2012.01.018; PMID: 22405013
  • Zhang J, Rouillon C, Kerou M, Reeks J, Brugger K, Graham S, Reimann J, Cannone G, Liu H, Albers SV, et al. Structure and mechanism of the CMR complex for CRISPR-mediated antiviral immunity. Mol Cell 2012; 45:303 - 13; http://dx.doi.org/10.1016/j.molcel.2011.12.013; PMID: 22227115
  • Lintner NG, Kerou M, Brumfield SK, Graham S, Liu H, Naismith JH, Sdano M, Peng N, She Q, Copié V, et al. Structural and functional characterization of an archaeal clustered regularly interspaced short palindromic repeat (CRISPR)-associated complex for antiviral defense (CASCADE). J Biol Chem 2011; 286:21643 - 56; http://dx.doi.org/10.1074/jbc.M111.238485; PMID: 21507944
  • van Duijn E, Barbu IM, Barendregt A, Jore MM, Wiedenheft B, Lundgren M, Westra ER, Brouns SJ, Doudna JA, van der Oost J, et al. Native tandem and ion mobility mass spectrometry highlight structural and modular similarities in clustered-regularly-interspaced shot-palindromic-repeats (CRISPR)-associated protein complexes from Escherichia coli and Pseudomonas aeruginosa. Mol Cell Proteomics 2012; 11:1430 - 41; http://dx.doi.org/10.1074/mcp.M112.020263; PMID: 22918228
  • Nam KH, Haitjema C, Liu X, Ding F, Wang H, DeLisa MP, Ke A. Cas5d protein processes pre-crRNA and assembles into a cascade-like interference complex in subtype I-C/Dvulg CRISPR-Cas system. Structure 2012; 20:1574 - 84; http://dx.doi.org/10.1016/j.str.2012.06.016; PMID: 22841292
  • Wiedenheft B, van Duijn E, Bultema JB, Waghmare SP, Zhou K, Barendregt A, Westphal W, Heck AJ, Boekema EJ, Dickman MJ, et al. RNA-guided complex from a bacterial immune system enhances target recognition through seed sequence interactions. Proc Natl Acad Sci U S A 2011; 108:10092 - 7; http://dx.doi.org/10.1073/pnas.1102716108; PMID: 21536913
  • Koonin EV, Makarova KS. CRISPR-Cas: evolution of an RNA-based adaptive immunity system in prokaryotes. RNA Biol 2013; 10:679 - 86; http://dx.doi.org/10.4161/rna.24022; PMID: 23439366
  • Hatoum-Aslan A, Samai P, Maniv I, Jiang W, Marraffini LA. A ruler protein in a complex for antiviral defense determines the length of small interfering CRISPR RNAs. J Biol Chem 2013; (Forthcoming)
  • Maris C, Dominguez C, Allain FH. The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression. FEBS J 2005; 272:2118 - 31; http://dx.doi.org/10.1111/j.1742-4658.2005.04653.x; PMID: 15853797
  • Fribourg S, Gatfield D, Izaurralde E, Conti E. A novel mode of RBD-protein recognition in the Y14-Mago complex. Nat Struct Biol 2003; 10:433 - 9; http://dx.doi.org/10.1038/nsb926; PMID: 12730685
  • Kadlec J, Izaurralde E, Cusack S. The structural basis for the interaction between nonsense-mediated mRNA decay factors UPF2 and UPF3. Nat Struct Mol Biol 2004; 11:330 - 7; http://dx.doi.org/10.1038/nsmb741; PMID: 15004547
  • Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 2004; 60:2126 - 32; http://dx.doi.org/10.1107/S0907444904019158; PMID: 15572765
  • Makarova KS, Aravind L, Wolf YI, Koonin EV. Unification of Cas protein families and a simple scenario for the origin and evolution of CRISPR-Cas systems. Biol Direct 2011; 6:38; http://dx.doi.org/10.1186/1745-6150-6-38; PMID: 21756346
  • Su AA, Tripp V, Randau L. RNA-Seq analyses reveal the order of tRNA processing events and the maturation of C/D box and CRISPR RNAs in the hyperthermophile Methanopyrus kandleri. Nucleic Acids Res 2013; 41:6250 - 8; http://dx.doi.org/10.1093/nar/gkt317; PMID: 23620296
  • Hatoum-Aslan A, Maniv I, Marraffini LA. Mature clustered, regularly interspaced, short palindromic repeats RNA (crRNA) length is measured by a ruler mechanism anchored at the precursor processing site. Proc Natl Acad Sci U S A 2011; 108:21218 - 22; http://dx.doi.org/10.1073/pnas.1112832108; PMID: 22160698
  • Jore MM, Lundgren M, van Duijn E, Bultema JB, Westra ER, Waghmare SP, Wiedenheft B, Pul U, Wurm R, Wagner R, et al. Structural basis for CRISPR RNA-guided DNA recognition by Cascade. Nat Struct Mol Biol 2011; 18:529 - 36; http://dx.doi.org/10.1038/nsmb.2019; PMID: 21460843
  • Kabsch W. XDS. Acta Crystallogr D Biol Crystallogr 2010; 66:125 - 32; http://dx.doi.org/10.1107/S0907444909047337; PMID: 20124692
  • McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. Phaser crystallographic software. J Appl Crystallogr 2007; 40:658 - 74; http://dx.doi.org/10.1107/S0021889807021206; PMID: 19461840
  • Adams PD, Afonine PV, Bunkóczi G, Chen VB, Davis IW, Echols N, Headd JJ, Hung LW, Kapral GJ, Grosse-Kunstleve RW, et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 2010; 66:213 - 21; http://dx.doi.org/10.1107/S0907444909052925; PMID: 20124702