Special issue: Precision medicine

A bioinformatic analysis of gene editing off-target loci altered by common polymorphisms, using ‘PopOff’

Received 20 Dec 2023, Accepted 23 Apr 2024, Published online: 09 May 2024

ABSTRACT

Gene editing therapies are designed to minimise off-target editing. However, it is not widespread practice for common polymorphisms to be considered when identifying potential off-target sites in silico. Nevertheless, genetic variants should be included as they have the potential to alter existing, or to generate new, off-target sites. To facilitate the consideration of common polymorphisms when designing targeted gene therapies we developed PopOff, a web-based tool that integrates minor allele frequencies from the gnomAD variant database into an off-target analysis. We used PopOff to analyse predicted off-target loci from guide RNAs used in four clinical trials and thirty-four research publications. From an analysis of sixty guides, we identified that approximately 20% of off-target loci overlap with a common polymorphism. Of these sites, 6.93% contained variants that reduce the level of mismatch between the off-target locus and guide, and therefore may increase off-target cleavage. In addition, we identified that 0.34% of common polymorphisms generated novel PAM sites, resulting in off-target loci that standard workflows would miss. Our findings demonstrate that common polymorphisms should be considered when designing guides to maximise the safety of CRISPR-based gene therapies. However, this may be problematic in populations where the breadth of genetic diversity remains uncharacterised.

Introduction

Rare diseases affect an estimated three hundred million people worldwide, representing 3.5–5.9% of the global population (Nguengang Wakap et al. Citation2020). Approximately 80% of rare diseases are classified as single-gene disorders caused by an inherited or de novo variant in a single gene (Blencowe et al. Citation2018). These conditions are often associated with significant morbidity and mortality, including fragile skin conditions such as epidermolysis bullosa (Tang et al. Citation2021; du Rand et al. Citation2022) and blood disorders including sickle cell anaemia (Lubeck et al. Citation2019). Recent advances in CRISPR/Cas based genome editing now offer hope of permanent cures and therapies for which traditional pharmaceutical treatments have been elusive (Doudna Citation2020; Uddin et al. Citation2020; Katti et al. Citation2022).

The CRISPR/Cas editing system is named after the clustered regularly interspaced short palindromic repeats (CRISPR) of bacterial genomes that precipitated its discovery (Jinek et al. Citation2012). It is composed of two key components: a guide RNA (gRNA) molecule responsible for the programmability of the editing system, and the CRISPR-associated protein (Cas), which acts as an endonuclease. The gRNA forms a complex with the Cas protein, which is then targeted to DNA complementary to the gRNA sequence. The Cas protein specifically cleaves DNA when a protospacer adjacent motif (PAM) sequence is found directly 3’ of the sequence complementary to the gRNA. The PAM sequence for the originally described S. pyogenes Type II system is an NGG consensus sequence; however, the PAM sequence differs between Cas proteins and can contain variable bases.

CRISPR/Cas based gene therapies focus primarily on activating one of two endogenous DNA repair mechanisms: error-prone non-homologous end joining (NHEJ), which can be utilised to introduce insertions and deletions (indels) at the cleavage site, and template-dependent homology-directed repair (HDR), which can be used to generate specific genetic alterations (Yang et al. Citation2020; Shams et al. Citation2022). The chosen repair mechanism for gene therapy is informed by the disease and specific mutations for which the therapy is being developed. Several CRISPR/Cas-based treatments are currently being evaluated in clinical trials. So far, treatments for HIV, genetic blindness (Sundaresan et al. Citation2023), cardiovascular disease (Naddaf Citation2023), β-thalassaemia (Fu et al. Citation2022) and transthyretin amyloidosis (Gillmore et al. Citation2021) are showing promise. The first CRISPR/Cas based therapy targeting sickle cell disease and transfusion-dependent beta thalassaemia was recently approved for use in the UK (Medicines and Healthcare products Regulatory Agency approval granted in November 2023) and USA (Food and Drug Administration approval granted in December 2023).

Although CRISPR guides are designed to target a single locus, it is possible for the system to cleave DNA when there is a sequence mismatch between the gRNA and DNA, i.e. at an off-target site. Although cleavage at off-target sites tends to occur at significantly lower rates than on-target cleavage (Fu et al. Citation2013; Hsu et al. Citation2013; Cho et al. Citation2014), it has a demonstrated potential to result in a variety of unintended genetic alterations, such as small indels or larger structural variants (Leenay et al. Citation2019; Liu et al. Citation2021; Hoijer et al. Citation2022; Kosicki et al. Citation2022; Wu et al. Citation2022; Hunt et al. Citation2023; Klermund et al. Citation2024). The factors that influence off-target cleavage are still being investigated, but a primary contributing factor appears to be similarity in the ‘seed sequence’ region of the gRNA, i.e. the 8–12 bases adjacent to the PAM site (Jinek et al. Citation2012; Jiang et al. Citation2015; Bertier et al. Citation2018). Off-target cleavage can be detected and quantified in an unbiased genome-wide manner using in vitro techniques such as CIRCLE-seq (Tsai et al. Citation2017), SITE-seq (Cameron et al. Citation2017), CHANGE-seq (Lazzarotto et al. Citation2020), and Nano-OTS (Hoijer et al. Citation2020). However, in vitro assays can produce high false positive rates when compared to events observed in vivo (Tsai et al. Citation2017) and so may not be a suitable proxy for in vivo events (Hoijer et al. Citation2020; Cromer et al. Citation2023).

A common method for detecting off-target editing events in vivo is whole genome sequencing of edited cell clones, or targeted sequencing of pooled cells at off-target loci identified by in silico (computational) tools (Hunt et al. Citation2023). Popular in silico tools for off-target prediction include Cas-OFFinder (Bae et al. Citation2014), CRISPOR (Concordet and Haeussler Citation2018), and Off-Spotter (Pliatsika and Rigoutsos Citation2015). A limitation of these tools is that they use the human reference genome as the basis for their predictions. Whilst this reference genome is broadly representative of the average individual, the genomic sequence of each individual differs from the reference. Studies of human genetic variation have been compiled into several large population databases such as dbSNP (Sherry et al. Citation1999), 1000 Genomes (Genomes Project et al. Citation2015), and gnomAD (Karczewski et al. Citation2020). gnomAD (Karczewski et al. Citation2020) has identified over two hundred million unique variants from over 76,215 individuals, which, when accounting for the number of bases in the human genome, equates to an average of one variant every 13.6 bases.

Common polymorphisms are important to consider when designing guide RNA for CRISPR/Cas based therapies as they have the potential to reduce on-target editing, enhance off-target cleavage, or create a novel PAM site and thus introduce new off-target loci. The impact of common polymorphisms on off-target editing has been demonstrated previously. For example, one study found that a common germline variant (chr5:7924673G/C allele), which reduced mismatches at an off-target site from three to two bases, resulted in a 30-fold increase (1% to 36.7%) in indel formation at this locus (Yang et al. Citation2014). Another recent study used CHANGE-seq to estimate that Cas9 activity was affected by single nucleotide variations at 15.2% of analysed off-target sites, including editing occurring where none was previously observed (Lazzarotto et al. Citation2020). Additionally, studies have shown that common polymorphisms can generate novel NGG PAM sites across the genome and that these frequently occur within predicted off-target sites for guides (Lessard et al. Citation2017; Scott and Zhang Citation2017; Carneiro et al. Citation2022). Lessard et al. (Citation2017) performed an in silico study using the 1000 Genomes database and identified that common polymorphisms generate over 11 million novel NGG PAM sequences across the genome and can result in the destruction of over 22 million NGG PAM sites.

Despite the evidence that common polymorphisms can affect off-target hybridisation and cutting by CRISPR/Cas complexes, to date there has been minimal integration of these genetic variants into in silico prediction tools. Only two ‘variant aware’ in silico off-target prediction tools have been published, namely VARSCOT (Wilson et al. Citation2019) and CRISPRitz (Cancellieri et al. Citation2020). Both tools require the input of a specific list of variants, rendering them most suitable for off-target prediction in a specific individual. Considering the potential impact of common polymorphisms on CRISPR/Cas based editing, we have developed PopOff, a tool for annotating the potential effect of genome-wide common polymorphisms on predicted off-target sites.

Methods

Guide selection

Guide sequences were selected from two sources: clinical trials and publications. The ClinicalTrials.gov database was searched between 1/2/2023 and 30/4/2023 to identify all clinical trials using CRISPR-mediated gene editing for treating human diseases. Trials were first selected using the search term ‘CRISPR’, which identified seventy-nine trials. All trials which did not target human DNA were excluded, along with trials involving chimeric antigen receptor T-cell therapy (as these trials were not directly administering editing reagents in vivo). These criteria identified four trials for which an associated publication providing gRNA sequences (either clinical or preclinical) was available.

A literature search was performed between 1/2/2023 and 30/4/2023 of the PubMed database using the search terms ‘CRISPR’ and ‘Gene therapy’, ‘Genetic correction’, or ‘First in human’. Thirty-four research articles were identified which contained gRNA sequences, from which all provided guides were utilised for analysis.

The PopOff off-target loci screening tool

PopOff is a web-app based tool for screening predicted CRISPR off-target loci for the presence of population variants. It is available online at https://popoff.cloud.edu.au, or offline as a downloadable tool at https://github.com/chrisamson/PopOff. The tool infers off-target/variant overlaps with a simple comparison between a locus, guide, and a variant database. This tool, written using R and a Shiny interface, takes two inputs from a user: the PAM sequence for their CRISPR editing system of choice and the predicted off-target loci. Off-target loci can be provided in the form of an output from Cas-OFFinder, CRISPOR, or Off-Spotter, or alternatively as tab- or comma-separated values. From the provided off-target loci file, PopOff extracts the coordinates of each locus (including chromosome, sense, start location, and end location for the GRCh38 reference genome), the gRNA sequence associated with the off-target locus, and the sequence of the locus.

Population variants that occur within or overlap with each off-target locus are identified from a subset of the gnomAD database which includes all variants with a popmax frequency (the maximum allele frequency observed across all populations) of at least 0.1%. These variants are stored within an SQLite database and indexed by genomic coordinate, so that the variants indexed to positions within a query off-target locus can be retrieved directly. Variants spanning multiple reference bases are given duplicate entries in the database, one for each base they span, and are de-duplicated upon retrieval.
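The indexing scheme described above can be sketched with Python's built-in sqlite3 module. This is only an illustration of the idea (one row per reference base, de-duplicated on retrieval). The table name, column names and helper functions are hypothetical and are not taken from the PopOff source, which is written in R.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical per-base variant index."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE variant_index ("
        "chrom TEXT, pos INTEGER, variant_id TEXT, ref TEXT, alt TEXT, popmax REAL)"
    )
    # Index by genomic coordinate so that range lookups are cheap.
    con.execute("CREATE INDEX idx_chrom_pos ON variant_index (chrom, pos)")
    return con


def add_variant(con, chrom, start, ref, alt, popmax):
    """Insert one row for every reference base the variant spans."""
    variant_id = f"{chrom}-{start}-{ref}-{alt}"
    rows = [
        (chrom, start + offset, variant_id, ref, alt, popmax)
        for offset in range(len(ref))
    ]
    con.executemany("INSERT INTO variant_index VALUES (?, ?, ?, ?, ?, ?)", rows)


def variants_in_locus(con, chrom, start, end):
    """Return each variant overlapping [start, end] exactly once."""
    cur = con.execute(
        "SELECT DISTINCT variant_id, ref, alt, popmax FROM variant_index "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY variant_id",
        (chrom, start, end),
    )
    return cur.fetchall()


con = build_demo_db()
add_variant(con, "chr1", 100, "A", "G", 0.02)     # SNV inside the locus
add_variant(con, "chr1", 98, "CTTA", "C", 0.005)  # deletion spanning bases 98-101
add_variant(con, "chr1", 300, "T", "C", 0.01)     # variant outside the locus
hits = variants_in_locus(con, "chr1", 100, 122)
```

The deletion has two index rows inside the queried range, and `SELECT DISTINCT` collapses them into a single hit.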

The identified variants are then annotated to the off-target loci they overlap, and the impact of each variant is inferred using the sequence information provided for the locus and the variant in question. This is performed by comparing the alternative allele of the variant to the off-target locus sequence and the gRNA or PAM sequence. For instance, if the alternative allele of a substitution variant matches the gRNA sequence but not the off-target locus sequence, then the variant is classified as an enhancing guide substitution. Alternatively, if the substitution alternative allele does not match the gRNA or off-target locus sequences, it is classified as an ablative guide substitution. A full list of effects and their definitions can be found in Table 1.
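One way to read these rules is as a change in the number of guide/locus mismatches. The sketch below classifies a single-base substitution on that basis, assuming the guide and locus are pre-aligned strings of equal length on the same strand. Treating "no change in mismatch count" as a neutral substitution is an assumption drawn from the description of neutral changes in the Results. Table 1 holds the authoritative definitions, and the function names here are hypothetical.

```python
def count_mismatches(guide, locus):
    """Count positions where the guide and the locus differ."""
    return sum(g != l for g, l in zip(guide.upper(), locus.upper()))


def classify_guide_substitution(guide, locus, index, alt_base):
    """Classify a substitution at 0-based `index` of the locus.

    Assumption: the effect is decided by the change in mismatch count
    between the guide and the locus once the alternative allele is applied.
    """
    if len(guide) != len(locus):
        raise ValueError("guide and locus must be the same length")
    alt_base = alt_base.upper()
    if locus[index].upper() == alt_base:
        raise ValueError("alternative allele equals the locus base")
    altered = locus[:index] + alt_base + locus[index + 1:]
    before = count_mismatches(guide, locus)
    after = count_mismatches(guide, altered)
    if after < before:
        return "Enhancing Guide Substitution"
    if after > before:
        return "Ablative Guide Substitution"
    return "Neutral Guide Substitution"


guide = "ACGTACGT"
locus = "ACGTACGA"  # one mismatch, at the last position
enhancing = classify_guide_substitution(guide, locus, 7, "T")  # mismatch removed
neutral = classify_guide_substitution(guide, locus, 7, "C")    # mismatch remains
ablative = classify_guide_substitution(guide, locus, 0, "G")   # new mismatch
```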

Table 1. A description of the predicted effects of population variants as determined by PopOff.

Results are provided as a table (which can be downloaded for later use) in addition to visualisations summarising the bulk results of all off-target loci provided.

Off-target prediction and analysis with PopOff

Off-target loci were identified using Cas-OFFinder (http://www.rgenome.net/cas-offinder/) (Bae et al. Citation2014), CRISPOR (http://crispor.tefor.net/) (Concordet and Haeussler Citation2018), and Off-Spotter (https://cm.jefferson.edu/Off-Spotter/) (Pliatsika and Rigoutsos Citation2015). Off-Spotter and CRISPOR were run through their online tools using the default recommended parameters. Cas-OFFinder off-target prediction was performed using the downloadable offline version (available at https://github.com/snugel/cas-offinder, version 2.4.1), limiting off-target prediction to loci with up to five mismatches. Permissive PAM sequence off-target analysis was performed with Cas-OFFinder using the same parameters, with every nucleotide of each PAM sequence replaced by ‘N’. Permissive PAM sequence off-target analysis was not achievable for all guides with CRISPOR or Off-Spotter due to limitations in guide sequence length and/or the inability to use a PAM sequence other than ‘NGG’ or ‘NNGRRT’. Thus, only Cas-OFFinder was used for permissive PAM sequence off-target analysis. Off-target loci for the human GRCh38 reference genome were exported from each of the off-target finding tools in a tab-separated format and uploaded directly into the PopOff tool.

Predicted variant effect frequency distribution

Comparison of the frequency distributions of variants for each of the effects predicted by PopOff (Ablative Guide Substitution, Enhancing Guide Substitution, Neutral Guide Substitution, Indel, PAM Loss, PAM Neutral, PAM Indel, Novel PAM, Non-PAM Enhancing Substitution, Non-PAM Ablative Substitution, Non-PAM Neutral Substitution, Non-PAM Indel) was performed via a permutation test using re-sampling with replacement. This allowed the distribution of variants for a given predicted effect to be compared with that of all common polymorphisms. Re-sampling was performed 10,000 times with a sample size equal to that of the tested distribution (i.e. the number of population variants with each predicted effect). The median of the tested distribution was then compared to the medians of the 10,000 re-sampled distributions to generate a p-value representing the likelihood that the two distributions are the same. These p-values were then corrected for multiple testing using the Holm–Bonferroni method.
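The resampling procedure can be sketched as follows, using only the Python standard library. The text does not state exactly how the p-value is derived from the resampled medians, so the two-sided rule used here (the share of resampled medians at least as far from the background median as the observed median, with a +1 correction) is an assumption. The function names are hypothetical.

```python
import random
import statistics


def resampled_median_p_value(tested, background, n_resamples=10_000, seed=0):
    """Compare the median of `tested` with medians of same-size resamples
    drawn with replacement from `background` (assumed two-sided rule)."""
    rng = random.Random(seed)
    centre = statistics.median(background)
    observed_shift = abs(statistics.median(tested) - centre)
    extreme = 0
    for _ in range(n_resamples):
        sample = rng.choices(background, k=len(tested))  # with replacement
        if abs(statistics.median(sample) - centre) >= observed_shift:
            extreme += 1
    return (extreme + 1) / (n_resamples + 1)


def holm_bonferroni(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k + 1), keep monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


# Toy data: background frequencies spread evenly between 0.001 and 1.0.
background = [i / 1000 for i in range(1, 1001)]
p_shifted = resampled_median_p_value([0.95] * 50, background, n_resamples=2000)
p_similar = resampled_median_p_value(background[::20], background, n_resamples=2000)
adjusted = holm_bonferroni([0.01, 0.04, 0.03])
```

A fixed seed keeps the toy example reproducible; in practice one p-value would be computed per predicted effect and the whole set passed to `holm_bonferroni`.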

Population bias analysis

To determine any potential population bias of the variants investigated, permutation resampling with replacement was also performed for each population described by gnomAD (African/African-American (Af/AfA), Ashkenazi Jewish (AJ), Amish (Am), East Asian (EA), European Finnish (Ef), European Non-Finnish (Enf), Latino/Admixed-American (L/AA), Middle Eastern (ME), Other Populations (Oth), and South Asian (SA)), with comparison of the distribution of variants for each population to all common polymorphisms.

The distributions of population frequency were extremely right-tailed, preventing typical Bayesian techniques. Therefore, population bias analysis was achieved by normalising population frequencies. Normalisation was performed by dividing the allelic frequency of each variant present in the different gnomAD-defined populations by its allelic frequency in the total population to give the relative allelic frequency in that population.
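The normalisation amounts to a single division per population. A minimal sketch is given below. The dictionary keys reuse the population abbreviations from the text, while the function name and the example frequencies are purely illustrative.

```python
def relative_allele_frequencies(population_af, total_af):
    """Divide each population's allelic frequency by the total-population
    allelic frequency to give the relative allelic frequency."""
    if total_af <= 0:
        raise ValueError("total allelic frequency must be positive")
    return {pop: af / total_af for pop, af in population_af.items()}


# Illustrative numbers only, not taken from gnomAD.
relative = relative_allele_frequencies({"Af/AfA": 0.034, "Enf": 0.005}, 0.01)
```

A value above one means the variant is more frequent in that population than in the total population, and a value below one means it is less frequent.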

Expected occurrence calculation

To determine the probability that an individual will carry a variant at an off-target locus, the probability that a randomly selected individual carries a population variant overlapping with each off-target locus (p) was first calculated. This probability is calculated as one minus the product of the complements of the allelic frequencies of the overlapping variants (assuming no linkage disequilibrium): $p = 1 - \prod_{i} (1 - AF_{i})$, where $AF_{i}$ is the allelic frequency of the $i$-th overlapping variant (Var_A, Var_B, and so on). To calculate the probability that an individual will carry a variant of a specific effect, the calculation above is limited to variants with that specific predicted effect.
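The calculation can be sketched in a few lines of Python. The function name, the (frequency, effect) tuple layout and the example frequencies are hypothetical. As in the text, the sketch works directly with allelic frequencies and assumes no linkage disequilibrium.

```python
import math


def carrier_probability(variants, effect=None):
    """Return 1 - prod(1 - AF) over the variants overlapping a locus.

    `variants` is a list of (allelic_frequency, predicted_effect) tuples;
    when `effect` is given, only variants with that effect are used.
    """
    frequencies = [
        af for af, predicted in variants if effect is None or predicted == effect
    ]
    return 1.0 - math.prod(1.0 - af for af in frequencies)


# Illustrative frequencies only.
locus_variants = [
    (0.10, "Ablative Guide Substitution"),
    (0.20, "Enhancing Guide Substitution"),
]
p_any = carrier_probability(locus_variants)  # 1 - 0.9 * 0.8
p_enhancing = carrier_probability(locus_variants, "Enhancing Guide Substitution")
one_in_n = 1 / p_any  # the same probability expressed as "1 in N individuals"
```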

Results

The PopOff tool

PopOff is an R Shiny-based (Shiny Citation2023) web tool that allows users to annotate potential off-target loci that may be affected by common polymorphisms contained in the gnomAD database. The tool is accessible at https://popoff.cloud.edu.au or as an offline downloadable tool at https://github.com/chrisamson/PopOff. Specifically, PopOff considers single nucleotide variants and indels listed in gnomAD v3.1.2 with a maximum population frequency (Popmax) of at least 0.1% (i.e. present in at least 1 in 1000 people within a specific population, referred to herein as ‘common polymorphisms’). The tool allows users to directly upload off-target loci for the human GRCh38 reference genome as identified by the commonly used in silico prediction tools CRISPOR, Cas-OFFinder, and Off-Spotter, or to provide the predicted off-target loci in a comma- or tab-delimited format. Users can specify the PAM sequence used for analysis, allowing the tool to identify where common polymorphisms create or destroy off-target loci PAM sites for diverse Cas proteins. The tool then identifies the off-target chromosomal location (including the guide sequence and the PAM site) and retrieves common polymorphisms from an indexed database on a per locus basis. The effect of each population variant is then classified based on the nucleotide differences between the off-target locus and the original guide and PAM sequence, and the position of the variant within the off-target locus. The criteria and definitions for each classification can be found in Table 1.
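Detecting PAM creation or loss for a user-specified PAM can be illustrated with standard IUPAC nucleotide codes. The sketch below is not PopOff's implementation. It assumes the site is given on the same strand as the PAM, and the function names and the generic 'Non-PAM Substitution' label (which stands in for the finer enhancing, ablative and neutral categories) are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_pam(site, pam):
    """True if every base of `site` is allowed by the PAM code at that position."""
    site, pam = site.upper(), pam.upper()
    return len(site) == len(pam) and all(
        base in IUPAC[code] for base, code in zip(site, pam)
    )


def classify_pam_substitution(site, index, alt_base, pam):
    """Classify a substitution at 0-based `index` of the PAM-position bases."""
    altered = site[:index] + alt_base + site[index + 1:]
    before, after = matches_pam(site, pam), matches_pam(altered, pam)
    if before and not after:
        return "PAM Loss"
    if before and after:
        return "PAM Neutral"
    if after:
        return "Novel PAM"
    return "Non-PAM Substitution"  # finer categories are not modelled here


loss = classify_pam_substitution("AGG", 1, "A", "NGG")     # AGG -> AAG
neutral = classify_pam_substitution("AGG", 0, "C", "NGG")  # AGG -> CGG
novel = classify_pam_substitution("AGA", 2, "G", "NGG")    # AGA -> AGG
```

Because the PAM is matched through the IUPAC table, the same functions work for longer PAMs such as ‘NNGRRT’.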

After analysis, the tool provides an interactive table where users can see the common polymorphisms affecting each off-target locus. The table provides the genomic location of each polymorphism and lists its potential impact on cleavage at the off-target locus. This is assessed based on the location of the variant within the off-target locus and the alteration that is generated, examples of which are listed in Table 1. Additionally, the table includes the minor allele frequency, maximum population frequency, and frequency of the variant in each gnomAD ancestry grouping. Tabs allow users to view just the variants predicted to enhance cleavage. In addition, PopOff provides summary figures displaying the proportion of off-target loci overlapping with common polymorphisms, the distribution of their predicted effects, and the distribution of their allelic frequency (both total and maximum population frequency) (Figure 1 shows a representative results summary page). Graphs and tables can be downloaded for offline analysis.

Figure 1. A representative screen shot of a results summary page from PopOff displaying the predicted effect and population frequency of common polymorphisms identified within off-target loci.


Off-target loci are affected by common polymorphisms that are predicted to increase off-target cleavage

To assess the potential influence of common polymorphisms at off-target sites in the context of gene therapy, we used PopOff to analyse the predicted off-target sites of sixty gRNAs. These were identified from four clinical trials (seven guides) and 33 research studies (53 guides), all exploring the use of single guide CRISPR/Cas-based editing to treat genetic diseases (Table 2). Off-target loci were identified for all guides using the Cas-OFFinder tool. CRISPOR and Off-Spotter were also used for guides that conformed to their query requirements (i.e. limited to those 20 bp in length and using supported PAM sequences, 22 guides in total). For the sixty guides investigated, a total of 139,198 unique off-target loci were identified using all three tools (Supplementary Table 1). A comparison of identified off-target loci was performed for the 22 guides supported by all three tools and limited to loci with up to four mismatches, because CRISPOR does not detect sites with more than four mismatches. A high concordance (91%) of identified loci was observed between Cas-OFFinder and CRISPOR, while Off-Spotter identified only 30% of the loci reported by the other two tools (Figure 2A). Despite this discrepancy, PopOff reported a similar proportion of off-target loci that overlap with common polymorphisms for each tool (Cas-OFFinder 20.41%, CRISPOR 19.40%, and Off-Spotter 21.05%) (Figure 2B).

Figure 2. Off-target loci statistics. A, A Venn diagram illustrating the degree of overlap of off-target loci identified by Cas-OFFinder, CRISPOR or Off-Spotter. B, The proportion of loci identified by PopOff that overlap with population variants using three off-target identifying tools. C, A graph showing the proportion of off-target loci that contained between one or up to 43 population variants. For each guide analysed (black circle), the percentage of off-target loci that contain a specific number of variants (as indicated on the x axis) is shown (each column contains data points representing each of the 60 analysed guides, and data points that equal zero are not shown). The average percentage for all guides is shown in red triangles. D, A violin plot showing the average distribution of population variants throughout the guide sequence. The nucleotide 5’ adjacent to the PAM region is numbered 1, with the seed sequence indicated between nucleotides 1 and 12 of the guide.


Table 2. A list of the CRISPR guide RNA sequences used in the PopOff analysis.

As expected, the proportion of off-target loci containing multiple common polymorphisms decreases as variant number increases. For example, 16.73% of all loci were found to overlap with one population variant, 1.27% with two, and 0.21% with three (raw data in Supplementary Table 2). This is shown in Figure 2C, which details, for only those off-target loci that contain a polymorphism, the proportion containing one or more variant nucleotides. The majority (83.63% on average) overlapped with one population variant, reducing to just over 10% on average with two, and declining thereafter. A small proportion of loci (on average <0.1%) overlapped with more than ten common polymorphisms; in these cases, all off-target sequences encompassed multiple unique alleles of polymorphic tandem repeat loci (primarily mononucleotide or dinucleotide repeats; in one case, a single off-target locus contained 31 variants, Figure 2C). We note that variants were on average distributed evenly through the guide and PAM regions of the off-target loci, with no major regional bias observed (Figure 2D).

The variants predicted by PopOff to overlap with off-target sites were interrogated to assess their potential impact on cleavage. Variants that reduced the guide-target sequence complementarity and are therefore predicted to decrease the potential for cleavage (i.e. an ablative guide substitution, indel, PAM loss or PAM indel) were identified in 75.98% of off-target sites (Figure 3A). Variants that made a neutral change to the guide-target sequence complementarity and are therefore not predicted to change the potential for cleavage (i.e. a neutral guide substitution or PAM neutral) were identified in 17.10% of sites. However, 6.93% of variants resulted in a nucleotide substitution that was predicted to increase the likelihood of cleavage due to increased sequence complementarity between the guide and locus (i.e. an enhancing guide substitution) (Figure 3A). We identified 26 loci where variants may reduce the number of mismatches from three to two and 251 loci where mismatches were reduced from four to three (enhancing guide substitutions, Supplementary Table 2). One site was identified where a population variant reduced the number of mismatches from one to zero. However, this site was the intended on-target editing location, and the variant in question was the targeted disease-causing population variant (Uchida et al. Citation2021). The population frequency and Popmax frequency of variants did not significantly differ between the different predicted effects (Supplementary Table 3), except for indels, which had significantly lower allelic frequencies (11.00% lower than all variants, p = 0.0105) and Popmax frequencies (10.10% lower, p = 0.0035).
Based on the population frequencies of off-target affecting variants and the rate at which they overlapped with off-target loci, we calculated the average probability that an individual carries a population variant that overlaps with a predicted off-target locus (regardless of effect) to be 2.2% (or 1 in 44.6 individuals). In contrast, the average probability that an individual carries a variant that reduces the number of mismatches and therefore enhances the probability of off-target cleavage is 0.12% (or 1 in 823 individuals) (see Methods section for analysis details).

Figure 3. The predicted effects of population variants on PAM containing (NGG) off-target sites. The predicted effect of population variants that overlap off-target loci for the 60 guides, as analysed using PopOff, is shown as a percentage of affected loci.


Common polymorphisms can generate novel off-target sites

We next investigated the likelihood of common polymorphisms generating off-target loci by introducing novel PAM sites. The tool can only identify novel PAM sites when provided with potential off-target loci lacking PAM sequences. Therefore, Cas-OFFinder was used to predict off-target loci for all guide sequences analysed, this time using a permissive PAM sequence (all bases as ambiguous ‘N’s – Cas-OFFinder is the only tool that allows for a user-defined PAM sequence in this way). A total of 4,653,332 potential off-target loci were identified across the sixty guides (Supplementary Table 4). Using PopOff, 18.46% of these loci were identified to overlap with common polymorphisms, with 3.03% specifically containing a population variant within the PAM site region (data not shown).

The number of variants per predicted off-target locus and the position of the variants within the loci were similar to the PAM-containing analysis (see Figure 4A–B). This ‘PAM-less’ analysis identified variants that sit in the PAM site, which therefore extended the ‘predicted consequences’ list (Figure 4C). For example, PopOff could identify variants that altered a non-PAM sequence to another non-PAM sequence that was less similar to the PAM (Non-PAM Ablative Substitution, 2.91% of affected loci), more similar to the PAM (Non-PAM Enhancing Substitution, 1.54%), or equally similar to the PAM (Non-PAM Neutral Substitution, 8.71%).

Figure 4. The predicted effects of population variants on PAM permissive off-target sites. A, Graph showing the proportion of off-target loci that contained between one or up to 43 population variants. For each guide analysed (black circle), the percentage of off-target loci that contain a specific number of variants (x axis) is shown (each column contains data points representing each of the 60 analysed guides, and data points that equal zero are not shown). The average percentage for all guides is shown in red triangles. B, A violin plot demonstrating that population variants are evenly distributed throughout the guide. The nucleotide 5’ adjacent to the PAM region is numbered 1. C, The predicted effect of population variants that overlap off-target loci for the 60 guides, as analysed using PopOff, is shown as a percentage of affected loci.


Of note, 0.34% of the variants were observed to alter a non-PAM site to a valid novel PAM site (Novel PAM, Figure 4C, and Supplementary Table 5). A small proportion (63/3594) of PAM-generating variants were observed to occur in up to 100% of the population. Although these are referred to as ‘population variants’, they actually reflect rare errors in the reference genome sequence; these are highlighted in green in Supplementary Table 5. These are unlikely to cause off-target cutting due to high levels of mismatch in the guide region (a 3 base pair mismatch was the minimum here), although this effect will be locus dependent. However, three off-target loci generated by common polymorphisms contained only two mismatches (3/3594; Supplementary Table 5), which represents a higher risk of off-target cutting. We note that one of these variants has a relatively high minor allele frequency of 22.7% in the African/African American population.

Based on the observed rate that common polymorphisms affect off-target loci and their observed population frequencies, we calculated from this analysis (i.e. off-target loci identified when not considering the PAM site) that the average probability that an individual carries a population variant that overlaps with a predicted off-target locus is similar whether the locus includes a PAM sequence (2.1%) or not (2.2%). Additionally, the average probability that an individual carries a mismatch-reducing variant (that enhances cleavage) in a predicted off-target locus is 0.13% per mismatch between the off-target locus and the guide sequence (or 1 in 795 individuals). Finally, the average probability that an individual carries a novel PAM site-generating variant in a potential off-target locus is 0.0066% (or 1 in 15,045 individuals).

Similar to the observations made with off-target loci containing a PAM sequence, there was no significant difference observed between the distributions of allelic frequency (median 0.84%), and Popmax (median 3.23%) for the majority of variant impacts predicted by PopOff (Supplementary Table 6). The exceptions to this were Non-PAM Enhancing Guide Substitutions (allelic frequency 46.344% higher than the average, p ≤ 0.0001, Popmax 35.88% higher than the average, p ≤ 0.0001) and Non-PAM Ablative Guide Substitution (allelic frequency 18.02% lower than the average, p ≤ 0.0001, Popmax 15.60% lower than the average, p ≤ 0.0001). In addition, the Popmax for Ablative guide substitutions (1.59% lower than the average, p = 0.0027), indels (5.26% higher than the average, p < 0.0001), and Non-PAM indels (6.21% lower than the average, p = 0.0027) were significantly different.

PopOff identified population specific variant effects

We next asked whether common polymorphisms impact off-target effects in some populations more than others. This might be expected, as genetic variation relative to the reference genome differs between populations. To investigate this, the relative allelic frequency (defined as the ratio of population allelic frequency to total allelic frequency) for each population variant affecting off-target loci was calculated for each population within the gnomAD database (Figure 5A). Primary distribution peaks were observed at approximately the total population frequency (i.e. a relative allelic frequency of one or less) for all populations except African/African-American, where a primary peak was observed at a relative allelic frequency of approximately 3.4 (Figure 5B). This suggests that, for a given variant, the allelic frequency is approximately 3.4 times higher in the African/African-American population than in the total population.
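
The relative allelic frequency defined above is a simple ratio. The sketch below illustrates it in Python; the variant and its frequencies are hypothetical values chosen for illustration, not data from this study, and the function name is an assumption rather than part of PopOff.

```python
def relative_allelic_frequency(population_af: float, total_af: float) -> float:
    """Population allelic frequency divided by total allelic frequency."""
    if total_af <= 0:
        raise ValueError("total allelic frequency must be positive")
    return population_af / total_af


# Hypothetical variant: frequencies are invented for illustration only.
total_af = 0.05
population_afs = {"Af/AfA": 0.17, "EA": 0.02, "Enf": 0.04}

for population, af in population_afs.items():
    ratio = relative_allelic_frequency(af, total_af)
    print(f"{population}: relative allelic frequency = {ratio:.2f}")
```

A ratio above one means the variant is more frequent in that population than in the total gnomAD cohort; a ratio below one means it is less frequent.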

Figure 5. PopOff identified population bias in allelic frequency. Violin plots indicating the relative allelic frequency, defined as population allelic frequency divided by total allelic frequency, of variants overlapping with off-target loci for all gnomAD-defined populations for variants affecting off-target loci with a defined (NGG) A, or permissive C, PAM sequence. Populations are African/African-American (Af/AfA), Ashkenazi Jewish (AJ), Amish (Am), East Asian (EA), European Finnish (Ef), European Non-Finnish (Enf), Latino/Admixed-American (L/AA), Middle Eastern (ME), Other Populations (Oth), and South Asian (SA). B and D, show the Af/AfA data at a finer scale.

Similar results were observed when the analysis was repeated for predicted off-target loci with a permissive PAM sequence (Figure 5C and D). The primary peak in the African/African-American population was observed consistently across all predicted effects of variants, for predicted off-target loci identified with a defined or permissive PAM site (see Supplementary Figure 1). We note that the increase in allelic frequency is observed across all African/African-American variants, not just those that overlap off-target sites (Supplementary Figure 2).

Discussion

Using the PopOff tool, we analysed the impact of common polymorphisms on predicted off-target loci for sixty CRISPR guide sequences identified from four clinical trials and 34 published studies. For the guides analysed, approximately one in five off-target loci (i.e. where a guide RNA is predicted to hybridise at an off-target site) overlap with a population variant. The frequent occurrence of common polymorphisms at predicted off-target loci has significant implications for guide design, given that the average probability that an individual will carry a variant in any given off-target locus is between 2.1 and 2.2%. Reassuringly, most common polymorphisms (∼92.13%) are predicted to introduce new mismatches between the guide and off-target locus, remove the PAM site, or to have no effect on the off-target locus. However, 6.93% of variants that coincide with a predicted off-target site reduced the level of mismatch between the off-target locus and the guide, thereby increasing the potential risk of off-target cleavage and unintended genomic editing events in patients.

From the sixty guides we examined, we identified 26 off-target loci where a population variant reduced the number of mismatches from 3 to 2. This level of reduction in mismatch has previously been demonstrated to increase off-target effects. For example, indel formation in an induced pluripotent stem cell line was increased from 1% to 36.7% (Yang et al. 2014) when a single nucleotide substitution was present (a heterozygous polymorphism that reduced the number of mismatches in an off-target site from 3 to 2). We note that variants that have the potential to affect off-target sites are not always rare. Some ‘population variants’ are commonly found within or across ethnic populations and reflect either errors in the reference genome or population-specific variations. The most prevalent variant identified in this study that reduced the mismatch at an off-target site from 3 to 2 nucleotides had an average minor allele frequency of 85.6% (Supplementary Table 2, guide chr8:13335048-13335070, variant 8:13335053C>T, highlighted in yellow). Therefore, it is possible that unexpected off-target effects could occur in many treated patients. This highlights the need to account for common polymorphisms, or at least those that are known, when designing guides for therapeutic use.
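
To make the ‘3 to 2 mismatches’ case concrete, the following Python sketch counts mismatches between a guide and an off-target site before and after a single nucleotide substitution. The sequences, the variant position, and the function names are hypothetical and chosen only for illustration; the labels ‘enhancing’ and ‘ablative’ are a simplified reading of the categories used in this article, not PopOff's actual implementation.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which the guide and the site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def apply_snv(site: str, index: int, ref: str, alt: str) -> str:
    """Return the site with a single-base substitution at a 0-based index."""
    if site[index] != ref:
        raise ValueError("reference allele does not match the site sequence")
    return site[:index] + alt + site[index + 1:]


def classify_substitution(guide: str, ref_site: str, alt_site: str) -> str:
    """Label a substitution by its effect on the guide/site mismatch count."""
    before = count_mismatches(guide, ref_site)
    after = count_mismatches(guide, alt_site)
    if after < before:
        return "enhancing"  # fewer mismatches, cleavage may become more likely
    if after > before:
        return "ablative"   # more mismatches, cleavage may become less likely
    return "no change"


# Hypothetical 20-nt guide and off-target site (three mismatches in the reference).
GUIDE = "GACGTTACCGGATCAAGCTA"
REF_SITE = "GATGTTACCAGATCACGCTA"
ALT_SITE = apply_snv(REF_SITE, 9, "A", "G")  # hypothetical A>G variant

print(count_mismatches(GUIDE, REF_SITE), "->", count_mismatches(GUIDE, ALT_SITE))
print(classify_substitution(GUIDE, REF_SITE, ALT_SITE))
```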

PopOff also identified common polymorphisms that generate novel PAM sites with the potential to induce editing at novel off-target loci (i.e. loci not predicted by standard off-target prediction tools). Such variants are of concern: although rare, in our analysis of sixty guides they produced three variant-specific off-target sites containing two mismatches (derived from three different guides) and 41 variant-specific sites with three guide mismatches. These findings support those previously made by Lessard et al. (2017), indicating that single nucleotide variations can generate millions of PAM sites genome-wide. The fact that PAM-generating and mismatch-reducing variants were observed with population frequencies of up to 100% (highlighted in green in Supplementary Tables 2 and 5) again highlights the importance of taking common polymorphisms into account when designing guides.
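
A novel PAM site arises when a variant converts a trinucleotide that does not match the PAM motif into one that does. The Python sketch below shows this check for the NGG motif only, on the strand as given; the function names are illustrative assumptions and this is not PopOff's implementation, which may also handle permissive PAMs, indels, and the reverse strand.

```python
import re

# NGG: any base followed by two guanines.
NGG_PATTERN = re.compile(r"^[ACGT]GG$")


def is_ngg_pam(trinucleotide: str) -> bool:
    """True if the three bases match the NGG motif."""
    return bool(NGG_PATTERN.match(trinucleotide.upper()))


def creates_novel_pam(ref_pam: str, alt_pam: str) -> bool:
    """True if the reference is not a PAM but the variant allele is."""
    return (not is_ngg_pam(ref_pam)) and is_ngg_pam(alt_pam)


# Hypothetical examples: reference trinucleotide, variant trinucleotide.
examples = [("TGA", "TGG"), ("AGG", "CGG"), ("TGG", "TGA")]
for ref_pam, alt_pam in examples:
    print(ref_pam, "->", alt_pam, creates_novel_pam(ref_pam, alt_pam))
```

Only the first example generates a novel PAM: the second already carries a valid PAM in the reference, and the third removes one.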

While PopOff can effectively screen off-target loci for population variants that may alter CRISPR editing efficacy, it is limited, in its current form, to population variants affecting 50 bases or fewer. This precludes the identification of any impact of structural variants (variants affecting more than 50 bases) such as those included in the gnomAD structural variant database (Collins et al. 2020) or in drafts of the human pangenome (Liao et al. 2023). Structural variants, like their smaller counterparts, may also alter off-target cleavage efficacy or generate novel off-target sites. However, to our knowledge there is nothing published on this topic to date, despite the clear risk that structural variants pose.

Off-target loci in this research were identified using three tools: Off-Spotter, CRISPOR, and Cas-OFFinder. While the loci identified by CRISPOR and Cas-OFFinder were largely congruent, Off-Spotter failed to identify more than two-thirds of the loci identified by the other two tools. Cas-OFFinder searches reference genomes in real time, whereas both CRISPOR and Off-Spotter identify off-target loci using pre-generated 20mer indexes of reference genomes. The difference in performance between CRISPOR and Off-Spotter, despite their similar methodologies, is surprising and warrants further analysis with a larger and more diverse cohort of guides.

We discovered an atypical distribution of variants in the gnomAD African/African-American population, where the primary peak of variants was 3.4 times higher than the frequency of the same variants in the total human population. These results are not entirely unexpected and reflect the increased variation for the African/African-American population compared to the reference genome (Auton et al. 2015; Carneiro et al. 2022). Therefore, these populations are at a theoretically higher risk of unintended editing outcomes from CRISPR/Cas-based therapies compared to other populations. The disproportionate risk is of particular concern in the development of CRISPR/Cas-based treatments for conditions more prevalent in African populations, such as sickle-cell anaemia, where African populations experience ∼75% of disease burden (Kato et al. 2018). For instance, we identified three novel potential off-target loci generated by common polymorphisms for three guides reported in a preclinical investigation of a gene therapy treatment for sickle-cell anaemia (chr3:210530659G>C, chr15:41523556C>G, chr15:98499274T>C, highlighted in yellow in Supplementary Table 5) (Uchida et al. 2021; Han et al. 2022). In all three cases, the common polymorphisms occurred at a frequency 3.4–3.5 times higher in the gnomAD African/African-American group compared to the total population. Ethnicity-associated differences in off-target sites highlight the importance of considering population-specific variants during guide design to ensure equitable use of CRISPR/Cas-based gene editing. A major concern here is that we do not know the full breadth of polymorphisms present in many populations. This is especially true of Indigenous populations where genetic diversity is not fully characterised. For example, a recent analysis of the genomes of Indigenous peoples of Australia detected a high proportion of genetic variation not observed in global reference panels, and indicated that the full breadth of genetic diversity remains to be characterised (Silcocks et al. 2023). This raises concerns about limited and inequitable access to genomic precision medicines.

PopOff utilises variants from gnomAD with a Popmax frequency of at least 0.1%. This cut-off was selected as gnomAD contains over 640 million variants, nearly twenty times more than the subset of 35.7 million used by PopOff. Using the entire gnomAD database would result in inefficiently long workflow processing times and complicate web app hosting (due to data storage requirements). The subset of variants used by PopOff is expected to account for more than 99.8% of all individual variation, but each individual is also predicted to carry approximately 4000–5000 rare single nucleotide variants that are not present in the gnomAD database (Halachev et al. 2019). Thus, while PopOff can aid in assessing population-scale risk during guide design, it cannot fully predict the risk to each individual. To accurately predict off-target loci personalised for an individual, a genome-wide sequencing approach (e.g. whole genome sequencing) would need to be considered, providing a personalised off-target analysis. Whilst an individualised approach does present challenges, including the computational resources required to identify variants in whole-genome sequencing data, tools such as CRISPRitz and VARSCOT have been developed to perform variant-aware prediction of off-target loci (Wilson et al. 2019; Cancellieri et al. 2020).
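
The Popmax cut-off described above amounts to a simple threshold filter. The following Python sketch shows the idea on a handful of hypothetical variant records; the record layout, variant identifiers, and function name are assumptions made for illustration and do not reflect gnomAD's file format or PopOff's internal code.

```python
POPMAX_THRESHOLD = 0.001  # 0.1%, the cut-off stated in the text


def filter_by_popmax(variants, threshold=POPMAX_THRESHOLD):
    """Keep variants whose Popmax frequency is at least the threshold."""
    return [v for v in variants if v["popmax"] >= threshold]


# Hypothetical records: identifiers and frequencies are invented.
variants = [
    {"id": "variant_A", "popmax": 0.2270},
    {"id": "variant_B", "popmax": 0.0010},
    {"id": "variant_C", "popmax": 0.0004},
]

kept = filter_by_popmax(variants)
print([v["id"] for v in kept])

# Size of the full database relative to the subset, using the figures in the text.
print(round(640 / 35.7, 1))
```

Variants below the threshold, such as the third record here, are the ones that a population-scale screen of this kind cannot see, which is why an individual's rare variants still require personalised analysis.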

Conclusion

We have developed PopOff, a web-based tool for annotating CRISPR/Cas9 off-target sites containing common polymorphisms listed in gnomAD. Using this tool, we demonstrate that off-target loci frequently overlap with common polymorphisms, altering both the gRNA binding site and PAM site of the off-target locus. In some cases, increased sequence similarity is observed between gRNAs and off-target loci, or novel PAM sites are generated. While the median occurrence of these variants is relatively low, they can range from allelic frequencies of 0.01% to 100%. Additionally, we observed that the allelic frequency of variants with off-target effects was ∼3.4 times higher in the gnomAD African/African-American population than in any other group. Off-target effects can be reduced by using high-fidelity Cas enzymes (Lee et al. 2018; Kim et al. 2023) or nickases, which require two independent guide sites to generate double-stranded breaks (Trevino and Zhang 2014; Li and Margolis 2021). However, considering the data presented here, we also recommend incorporating common polymorphisms during guide design to enhance the safety of CRISPR-based therapies.

Supplemental material

Acknowledgements

PopOff web hosting is provided by the Nectar Research Cloud, a service of the Australian Research Data Commons (ARDC). Samson devised the initial concept of the manuscript and composed the initial draft. The initial literature review was performed by Samson, Hunt, and du Rand. PopOff was developed by Samson. All authors contributed to the critical analysis of data and revisions of the draft manuscript, which were led by Sheppard. The final manuscript has been read and approved for publication by all authors.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

Open Access funding enabled and organised by CAUL and its Member Institutions. Samson was supported by an Auckland Medical Research Foundation project grant awarded to Sheppard, grant number 1120018. Whitford was supported by the Dawn Fellowship, administered by The Neurological Foundation, grant number 2115 DFE.

References

  • Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. 2015. A global reference for human genetic variation. Nature. 526(7571):68–74. doi:10.1038/nature15393.
  • Bae S, Park J, Kim JS. 2014. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics. 30(10):1473–1475. doi:10.1093/bioinformatics/btu048.
  • Bchetnia M, Dionne Gagne R, Powell J, Morin C, Mccuaig C, Duperee A, Germain L, Tremblay JP, Laprise C. 2022. Allele-specific inactivation of an autosomal dominant epidermolysis bullosa simplex mutation using CRISPR-Cas9. CRISPR Journal. 5(4):586–597.
  • Benati D, Miselli F, Cocchiarella F, Patrizi C, Carretero M, Baldassarri S, Ammendola V, Has C, Colloca S, Del Rio M, et al. 2018. CRISPR/Cas9-mediated in situ correction of LAMB3 gene in keratinocytes derived from a junctional epidermolysis bullosa patient. Molecular Therapy. 26(11):2592–2603.
  • Bertier LD, Ron M, Huo H, Bradford KJ, Britt AB, Michelmore RW. 2018. High-resolution analysis of the efficiency, heritability, and editing outcomes of CRISPR/Cas9-induced modifications of NCED4 in Lettuce (Lactuca sativa). G3 (Bethesda). 8(5):1513–1521.
  • Blencowe H, Moorthie S, Petrou M, Hamamy H, Povey S, Bittles A, Gibbons S, Darlison M, Modell B. 2018. Rare single gene disorders: estimating baseline prevalence and outcomes worldwide. Journal of Community Genetics. 9(4):397–406. doi:10.1007/s12687-018-0376-2.
  • Bonafont J, Mencia A, Chacon-Solano E, Srifa W, Vaidyanathan S, Romano R, Garcia M, Hervas-Salcedo R, Ugalde L, Duarte B, et al. 2021. Correction of recessive dystrophic epidermolysis bullosa by homology-directed repair-mediated genome editing. Molecular Therapy. 29(6):2008–2018.
  • Bonafont J, Mencia A, Garcia M, Torres R, Rodriguez S, Carretero M, Chacon-Solano E, Modamio-Hoybjor S, Marinas L, Leon C, et al. 2019. Clinically relevant correction of recessive dystrophic epidermolysis bullosa by dual sgRNA CRISPR/Cas9-mediated gene editing. Molecular Therapy. 27(5):986–998.
  • Cameron P, Fuller CK, Donohoue PD, Jones BN, Thompson MS, Carter MM, Gradia S, Vidal B, Garner E, Slorach EM, et al. 2017. Mapping the genomic landscape of CRISPR-Cas9 cleavage. Nature Methods. 14(6):600–606. doi:10.1038/nmeth.4284.
  • Cancellieri S, Canver MC, Bombieri N, Giugno R, Pinello L. 2020. CRISPRitz: rapid, high-throughput and variant-aware in silico off-target site identification for CRISPR genome editing. Bioinformatics. 36(7):2001–2008. doi:10.1093/bioinformatics/btz867.
  • Carneiro P, de Freitas MV, Matte U. 2022. In silico analysis of potential off-target sites to gene editing for Mucopolysaccharidosis type I using the CRISPR/Cas9 system: Implications for population-specific treatments. PLoS One. 17(1):e0262299. doi:10.1371/journal.pone.0262299.
  • Cho SW, Kim S, Kim Y, Kweon J, Kim HS, Bae S, Kim JS. 2014. Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome Research. 24(1):132–141. doi:10.1101/gr.162339.113.
  • Collins RL, Brand H, Karczewski KJ, Zhao X, Alfoldi J, Francioli LC, Khera AV, Lowther C, Gauthier LD, Wang H, et al. 2020. A structural variation reference for medical and population genetics. Nature. 581(7809):444–451. doi:10.1038/s41586-020-2287-8.
  • Concordet JP, Haeussler M. 2018. CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens. Nucleic Acids Research. 46(W1):W242–W245. doi:10.1093/nar/gky354.
  • Croci S, Carriero ML, Capitani K, Daga S, Donati F, Frullanti E, Lamacchia V, Tita R, Giliberti A, Valentino F, et al. 2020. High rate of HDR in gene editing of p.(Thr158Met) MECP2 mutational hotspot. European Journal of Human Genetics. 28(9):1231–1242.
  • Cromer MK, Majeti KR, Rettig GR, Murugan K, Kurgan GL, Bode NM, Hampton JP, Vakulskas CA, Behlke MA, Porteus MH. 2023. Comparative analysis of CRISPR off-target discovery tools following ex vivo editing of CD34(+) hematopoietic stem and progenitor cells. Molecular Therapy. 31(4):1074–1087. doi:10.1016/j.ymthe.2023.02.011.
  • Dabrowska M, Juzwa W, Krzyzosiak WJ, Olejniczak M. 2018. Precise excision of the CAG tract from the huntingtin gene by Cas9 nickases. Frontiers in Neuroscience. 12:75.
  • Doudna JA. 2020. The promise and challenge of therapeutic genome editing. Nature. 578(7794):229–236. doi:10.1038/s41586-020-1978-5.
  • du Rand A, Hunt JMT, Feisst V, Sheppard HM. 2022. Epidermolysis bullosa: a review of the tissue-engineered skin substitutes used to treat wounds. Molecular Diagnosis & Therapy. 26(6):627–643. doi:10.1007/s40291-022-00613-2.
  • Fu B, Liao J, Chen S, Li W, Wang Q, Hu J, Yang F, Hsiao S, Jiang Y, Wang L, et al. 2022. CRISPR-Cas9-mediated gene editing of the BCL11A enhancer for pediatric beta(0)/beta(0) transfusion-dependent beta-thalassemia. Nature Medicine. 28(8):1573–1580. doi:10.1038/s41591-022-01906-z.
  • Fu Y, Foden JA, Khayter C, Maeder ML, Reyon D, Joung JK, Sander JD. 2013. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature Biotechnology. 31(9):822–826. doi:10.1038/nbt.2623.
  • Gillmore JD, Gane E, Taubel J, Kao J, Fontana M, Maitland ML, Seitzer J, O'Connell D, Walsh KR, Wood K, et al. 2021. CRISPR-Cas9 in vivo gene editing for transthyretin amyloidosis. New England Journal of Medicine. 385(6):493–502. doi:10.1056/NEJMoa2107454.
  • Ginn SL, Amaya AK, Liao SHY, Zhu E, Cunningham SC, Lee M, Hallwirth CV, Logan GJ, Tay SS, Cesare AJ, et al. 2020. Efficient in vivo editing of OTC-deficient patient-derived primary human hepatocytes. JHEP Reports. 2(1):100065.
  • Hainzl S, Peking P, Kocher T, Murauer EM, Larcher F, Del Rio M, Duarte B, Steiner M, Klausegger A, Bauer JW, et al. 2017. COL7A1 editing via CRISPR/Cas9 in recessive dystrophic epidermolysis bullosa. Molecular Therapy. 25(11):2573–2584.
  • Halachev M, Meynert A, Taylor MS, Vitart V, Kerr SM, Klaric L, Consortium SGP, Aitman TJ, Haley CS, Prendergast JG, et al. 2019. Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions. PLOS Genetics. 15(11):e1008480. doi:10.1371/journal.pgen.1008480.
  • Han Y, Tan X, Jin T, Zhao S, Hu L, Zhang W, Kurita R, Nakamura Y, Liu J, Li D, et al. 2022. CRISPR/Cas9-based multiplex genome editing of BCL11A and HBG efficiently induces fetal hemoglobin expression. European Journal of Pharmacology. 918:174788. doi:10.1016/j.ejphar.2022.174788.
  • He L, Wang S, Peng L, Zhao H, Li S, Han X, Habimana JD, Chen Z, Wang C, Peng Y. 2021. CRISPR/Cas9 mediated gene correction ameliorates abnormal phenotypes in spinocerebellar ataxia type 3 patient-derived induced pluripotent stem cells. Translational Psychiatry. 11(1):479.
  • Hoijer I, Emmanouilidou A, Ostlund R, van Schendel R, Bozorgpana S, Tijsterman M, Feuk L, Gyllensten U, den Hoed M, Ameur A. 2022. CRISPR-Cas9 induces large structural variants at on-target and off-target sites in vivo that segregate across generations. Nature Communications. 13(1):627. doi:10.1038/s41467-022-28244-5.
  • Hoijer I, Johansson J, Gudmundsson S, Chin CS, Bunikis I, Haggqvist S, Emmanouilidou A, Wilbe M, den Hoed M, Bondeson ML, et al. 2020. Amplification-free long-read sequencing reveals unforeseen CRISPR-Cas9 off-target activity. Genome Biology. 21(1):290. doi:10.1186/s13059-020-02206-w.
  • Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O, et al. 2013. DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology. 31(9):827–832. doi:10.1038/nbt.2647.
  • Hunt JMT, Samson CA, Rand AD, Sheppard HM. 2023. Unintended CRISPR-Cas9 editing outcomes: a review of the detection and prevalence of structural variants generated by gene-editing in human cells. Human Genetics. 142(6):705–720. doi:10.1007/s00439-023-02561-1.
  • Itoh M, Kawagoe S, Tamai K, Nakagawa H, Asahina A, Okano HJ. 2020. Footprint-free gene mutation correction in induced pluripotent stem cell (iPSC) derived from recessive dystrophic epidermolysis bullosa (RDEB) using the CRISPR/Cas9 and piggyBac transposon system. Journal of Dermatological Science. 98(3):163–172.
  • Izmiryan A, Ganier C, Bovolenta M, Schmitt A, Mavilio F, Hovnanian A. 2018. Ex Vivo COL7A1 correction for recessive dystrophic epidermolysis bullosa using CRISPR/Cas9 and homology-directed repair. Molecular Therapy Nucleic Acids. 12:554–567.
  • Jiang F, Zhou K, Ma L, Gressel S, Doudna JA. 2015. A Cas9-guide RNA complex preorganized for target DNA recognition. Science. 348(6242):1477–1481. doi:10.1126/science.aab1452.
  • Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E. 2012. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 337(6096):816–821. doi:10.1126/science.1225829.
  • Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alfoldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP, et al. 2020. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 581(7809):434–443. doi:10.1038/s41586-020-2308-7.
  • Kato GJ, Piel FB, Reid CD, Gaston MH, Ohene-Frempong K, Krishnamurti L, Smith WR, Panepinto JA, Weatherall DJ, Costa FF, et al. 2018. Sickle cell disease. Nature Reviews Disease Primers. 4:18010. doi:10.1038/nrdp.2018.10.
  • Katti A, Diaz BJ, Caragine CM, Sanjana NE, Dow LE. 2022. CRISPR in cancer biology and therapy. Nature Reviews Cancer. 22(5):259–279. doi:10.1038/s41568-022-00441-w.
  • Kim YH, Kim N, Okafor I, Choi S, Min S, Lee J, Bae SM, Choi K, Choi J, Harihar V, et al. 2023. Sniper2L is a high-fidelity Cas9 variant with high activity. Nature Chemical Biology. 19(8):972–980. doi:10.1038/s41589-023-01279-5.
  • Klermund J, Rhiel M, Kocher T, Chmielewski KO, Bischof J, Andrieux G, Gaz ME, Hainzl S, Boerries M, Cornu TI, et al. 2024. On- and off-target effects of paired CRISPR-Cas nickase in primary human cells. Molecular Therapy. doi:10.1016/j.ymthe.2024.03.006.
  • Kocher T, Bischof J, Haas SA, March OP, Liemberger B, Hainzl S, Illmer J, Hoog A, Muigg K, Binder HM, et al. 2021. A non-viral and selection-free COL7A1 HDR approach with improved safety profile for dystrophic epidermolysis bullosa. Molecular Therapy Nucleic Acids. 25:237–250.
  • Kosicki M, Allen F, Steward F, Tomberg K, Pan Y, Bradley A. 2022. Cas9-induced large deletions and small indels are controlled in a convergent fashion. Nature Communications. 13(1):3422. doi:10.1038/s41467-022-30480-8.
  • Lattanzi A, Camarena J, Lahiri P, Segal H, Srifa W, Vakulskas CA, Frock RL, Kenrick J, Lee C, Talbott N, et al. 2021. Development of beta-globin gene correction in human hematopoietic stem cells as a potential durable treatment for sickle cell disease. Science Translational Medicine. 13(598).
  • Lazzarotto CR, Malinin NL, Li Y, Zhang R, Yang Y, Lee G, Cowley E, He Y, Lan X, Jividen K, et al. 2020. CHANGE-seq reveals genetic and epigenetic effects on CRISPR-Cas9 genome-wide activity. Nature Biotechnology. 38(11):1317–1327. doi:10.1038/s41587-020-0555-7.
  • Lee JK, Jeong E, Lee J, Jung M, Shin E, Kim YH, Lee K, Jung I, Kim D, Kim S, et al. 2018. Directed evolution of CRISPR-Cas9 to increase its specificity. Nature Communications. 9(1):3048. doi:10.1038/s41467-018-05477-x.
  • Leenay RT, Aghazadeh A, Hiatt J, Tse D, Roth TL, Apathy R, Shifrut E, Hultquist JF, Krogan N, Wu Z, et al. 2019. Large dataset enables prediction of repair after CRISPR-Cas9 editing in primary T cells. Nature Biotechnology. 37(9):1034–1037. doi:10.1038/s41587-019-0203-2.
  • Lessard S, Francioli L, Alfoldi J, Tardif JC, Ellinor PT, MacArthur DG, Lettre G, Orkin SH, Canver MC. 2017. Human genetic variation alters CRISPR-Cas9 on- and off-targeting specificity at therapeutically implicated loci. Proceedings of the National Academy of Sciences. 114(52):E11257–E11266. doi:10.1073/pnas.1714640114.
  • Li PP, Margolis RL. 2021. Use of single guided Cas9 nickase to facilitate precise and efficient genome editing in human iPSCs. Scientific Reports. 11(1):9865. doi:10.1038/s41598-021-89312-2.
  • Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ, et al. 2023. A draft human pangenome reference. Nature. 617(7960):312–324. doi:10.1038/s41586-023-05896-x.
  • Liu M, Zhang W, Xin C, Yin J, Shang Y, Ai C, Li J, Meng FL, Hu J. 2021. Global detection of DNA repair outcomes induced by CRISPR-Cas9. Nucleic Acids Research. 49(15):8732–8742. doi:10.1093/nar/gkab686.
  • Liu X, Lillywhite J, Zhu W, Huang Z, Clark AM, Gosstola N, Maguire CT, Dykxhoorn D, Chen ZY, Yang J. 2021. Generation and genetic correction of USH2A c.2299delG mutation in patient-derived induced pluripotent stem cells. Genes (Basel). 12(6).
  • Lubeck D, Agodoa I, Bhakta N, Danese M, Pappu K, Howard R, Gleeson M, Halperin M, Lanzkron S. 2019. Estimated life expectancy and income of patients with sickle cell disease compared with those without sickle cell disease. JAMA Network Open. 2(11):e1915374. doi:10.1001/jamanetworkopen.2019.15374.
  • Maeder ML, Stefanidakis M, Wilson CJ, Baral R, Barrera LA, Bounoutas GS, Bumcrot D, Chao H, Ciulla DM, Dasilva JA, et al. 2019. Development of a gene-editing approach to restore vision loss in Leber congenital amaurosis type 10. Nature Medicine. 25(2):229–233.
  • Muthel S, Marg A, Ignak B, Kieshauer J, Escobar H, Stadelmann C, Spuler S. 2023. Cas9-induced single cut enables highly efficient and template-free repair of a muscular dystrophy causing founder mutation. Molecular Therapy Nucleic Acids. 31:494–511.
  • Naddaf M. 2023. First trial of ‘base editing’ in humans lowers cholesterol – but raises safety concerns. Nature. 623(7988):671–672. doi:10.1038/d41586-023-03543-z.
  • Nakamura T, Morishige S, Ozawa H, Kuboyama K, Yamasaki Y, Oya S, Yamaguchi M, Aoyama K, Seki R, Mouri F, et al. 2020. Successful correction of factor V deficiency of patient-derived iPSCs by CRISPR/Cas9-mediated gene editing. Haemophilia. 26(5):826–833.
  • Nguengang Wakap S, Lambert DM, Olry A, Rodwell C, Gueydan C, Lanneau V, Murphy D, Le Cam Y, Rath A. 2020. Estimating cumulative point prevalence of rare diseases: analysis of the Orphanet database. European Journal of Human Genetics. 28(2):165–173. doi:10.1038/s41431-019-0508-0.
  • Osborn MJ, Newby GA, Mcelroy AN, Knipping F, Nielsen SC, Riddle MJ, Xia L, Chen W, Eide CR, Webber BR, et al. 2020. Base editor correction of COL7A1 in recessive dystrophic epidermolysis bullosa patient-derived fibroblasts and iPSCs. Journal of Investigative Dermatology. 140(2):338–347.
  • Petkovic I, Bischof J, Kocher T, March OP, Liemberger B, Hainzl S, Strunk D, Raninger AM, Binder HM, Reichelt J, et al. 2022. COL17A1 editing via homology-directed repair in junctional epidermolysis bullosa. Front Med (Lausanne). 9:976604.
  • Pliatsika V, Rigoutsos I. 2015. Off-Spotter: very fast and exhaustive enumeration of genomic lookalikes for designing CRISPR/Cas guide RNAs. Biology Direct. 10:4. doi:10.1186/s13062-015-0035-z.
  • Pohler M, Guttmann S, Nadzemova O, Lenders M, Brand E, Zibert A, Schmidt HH, Sandfort V. 2020. CRISPR/Cas9-mediated correction of mutated copper transporter ATP7B. PLoS One. 15(9):e0239411.
  • Rocca CJ, Rainaldi JN, Sharma J, Shi Y, Haquang JH, Luebeck J, Mali P, Cherqui S. 2020. CRISPR-Cas9 gene editing of hematopoietic stem cells from patients with Friedreich's ataxia. Molecular Therapy. Methods and Clinical Development. 17:1026–1036.
  • Scott DA, Zhang F. 2017. Implications of human genetic variation in CRISPR-based therapeutic genome editing. Nature Medicine. 23(9):1095–1101. doi:10.1038/nm.4377.
  • Shams F, Bayat H, Mohammadian O, Mahboudi S, Vahidnezhad H, Soosanabadi M, Rahimpour A. 2022. Advance trends in targeting homology-directed repair for accurate gene editing: an inclusive review of small molecules and modified CRISPR-Cas9 systems. Bioimpacts. 12(4):371–391. doi:10.34172/bi.2022.23871.
  • Sherry ST, Ward M, Sirotkin K. 1999. dbSNP-database for single nucleotide polymorphisms and other classes of minor genetic variation. Genome Research. 9(8):677–679. doi:10.1101/gr.9.8.677.
  • Shi J. 2022. Preliminary safety and efficacy results of edi001: an investigator initiated trial on CRISPR/Cas9-modified autologous CD34 hematopoietic stem and progenitor cells for patients with transfusion dependent β- thalassemia. Blood. 140:10652–10653.
  • Shin JW, Kim KH, Chao MJ, Atwal RS, Gillis T, Macdonald ME, Gusella JF, Lee JM. 2016. Permanent inactivation of Huntington's disease mutation by personalized allele-specific CRISPR/Cas9. Human Molecular Genetics. 25(20):4566–4576.
  • Shinkuma S, Guo Z, Christiano AM. 2016. Site-specific genome editing for correction of induced pluripotent stem cells derived from dominant dystrophic epidermolysis bullosa. Proceedings of the National Academy of Sciences of the United States of America. 113(20):5676–5681.
  • Shiny. 2023. https://shiny.posit.co/. [accessed 2023].
  • Silcocks M, Farlow A, Hermes A, et al. 2023. Indigenous Australian genomes show deep structure and rich novel variation. Nature. 624:593–601. doi:10.1038/s41586-023-06831-w.
  • Sundaresan Y, Yacoub S, Kodati B, Amankwa CE, Raola A, Zode G. 2023. Therapeutic applications of CRISPR/Cas9 gene editing technology for the treatment of ocular diseases. The FEBS Journal. 290(22):5248–5269. doi:10.1111/febs.16771.
  • Takashima S, Shinkuma S, Fujita Y, Nomura T, Ujiie H, Natsuga K, Iwata H, Nakamura H, Vorobyev A, Abe R, et al. 2019. Efficient gene reframing therapy for recessive dystrophic epidermolysis bullosa with CRISPR/Cas9. Journal of Investigative Dermatology. 139(8):1711–1721.e1714.
  • Tang JY, Marinkovich MP, Lucas E, Gorell E, Chiou A, Lu Y, Gillon J, Patel D, Rudin D. 2021. A systematic literature review of the disease burden in patients with recessive dystrophic epidermolysis bullosa. Orphanet Journal of Rare Diseases. 16(1):175. doi:10.1186/s13023-021-01811-7.
  • Trevino AE, Zhang F. 2014. Genome editing using Cas9 nickases. Methods in Enzymology. 546:161–174. doi:10.1016/B978-0-12-801185-0.00008-8.
  • Tsai SQ, Nguyen NT, Malagon-Lopez J, Topkar VV, Aryee MJ, Joung JK. 2017. CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nature Methods. 14(6):607–614. doi:10.1038/nmeth.4278.
  • Uchida N, Li L, Nassehi T, Drysdale CM, Yapundich M, Gamer J, Haro-Mora JJ, Demirci S, Leonard A, Bonifacino AC, et al. 2021. Preclinical evaluation for engraftment of CD34(+) cells gene-edited at the sickle cell disease locus in xenograft mouse and non-human primate models. Cell Reports Medicine. 2(4):100247. doi:10.1016/j.xcrm.2021.100247.
  • Uddin F, Rudin CM, Sen T. 2020. CRISPR gene therapy: applications, limitations, and implications for the future. Frontiers in Oncology. 10:1387. doi:10.3389/fonc.2020.01387.
  • Wang DN, Wang ZQ, Jin M, Lin MT, Wang N. 2022. CRISPR/Cas9-based genome editing for the modification of multiple duplications that cause Duchenne muscular dystrophy. Gene Therapy. 29(12):730–737.
  • Wei R, Yang J, Cheng CW, Ho WI, Li N, Hu Y, Hong X, Fu J, Yang B, Liu Y, et al. 2022. CRISPR-targeted genome editing of human induced pluripotent stem cell-derived hepatocytes for the treatment of Wilson's disease. JHEP Reports. 4(1):100389.
  • Wilson LOW, Hetzel S, Pockrandt C, Reinert K, Bauer DC. 2019. VARSCOT: variant-aware detection and scoring enables sensitive and personalized off-target detection for CRISPR-Cas9. BMC Biotechnology. 19(1):40. doi:10.1186/s12896-019-0535-5.
  • Wu J, Zou Z, Liu Y, Liu X, Zhangding Z, Xu M, Hu J. 2022. CRISPR/Cas9-induced structural variations expand in T lymphocytes in vivo. Nucleic Acids Research. 50(19):11128–11137. doi:10.1093/nar/gkac887.
  • Xiao R, Zhou M, Wang P, Zeng B, Wu L, Hu Z, Liang D. 2022. Full-length dystrophin restoration via targeted exon addition in DMD-patient specific iPSCs and cardiomyocytes. International Journal of Molecular Sciences. 23(16).
  • Xu L, Wang J, Liu Y, Xie L, Su B, Mou D, Wang L, Liu T, Wang X, Zhang B, et al. 2019. CRISPR-edited stem cells in a patient with HIV and acute lymphocytic leukemia. New England Journal of Medicine. 381(13):1240–1247. doi:10.1056/NEJMoa1817426.
  • Xu L, Yang H, Gao Y, Chen Z, Xie L, Liu Y, Liu Y, Wang X, Li H, Lai W, et al. 2017. CRISPR/Cas9-mediated CCR5 ablation in human hematopoietic stem/progenitor cells confers HIV-1 resistance in vivo. Molecular Therapy. 25(8):1782–1789.
  • Xu P, Chen Z, Ma J, Shan Y, Wang Y, Xie B, Zheng D, Guo F, Song X, Gao G, et al. 2023. Biallelic CLCN2 mutations cause retinal degeneration by impairing retinal pigment epithelium phagocytosis and chloride channel function. Human Genetics. 142(4):577–593.
  • Yang H, Ren S, Yu S, Pan H, Li T, Ge S, Zhang J, Xia N. 2020. Methods favoring homology-directed repair choice in response to CRISPR/Cas9 induced-double strand breaks. International Journal of Molecular Sciences. 21(18).
  • Yang L, Grishin D, Wang G, Aach J, Zhang CZ, Chari R, Homsy J, Cai X, Zhao Y, Fan JB, et al. 2014. Targeted and genome-wide sequencing reveal single nucleotide variations impacting specificity of Cas9 in human stem cells. Nature Communications. 5:5507. doi:10.1038/ncomms6507.
  • Yun Y, Hong SA, Kim KK, Baek D, Lee D, Londhe AM, Lee M, Yu J, McEachin ZT, Bassell GJ. 2020. CRISPR-mediated gene correction links the ATP7A M1311V mutations with amyotrophic lateral sclerosis pathogenesis in one individual. Communications Biology. 3(1):33.