mAbs Volume 14, 2022 - Issue 1

Open access

In silico high throughput mutagenesis and screening of signal peptides to mitigate N-terminal heterogeneity of recombinant monoclonal antibodies

Xin Yu^a, Drug Discovery Science and Technology, AbbVie Bioresearch Center, Worcester, MA, US. Correspondence: [email protected]

https://orcid.org/0000-0002-8570-7069

Merlinda Conyne^b, Drug Discovery Science and Technology, AbbVie Inc, North Chicago, IL, US

Marc R. Lake^b, Drug Discovery Science and Technology, AbbVie Inc, North Chicago, IL, US

Karl A. Walter^b, Drug Discovery Science and Technology, AbbVie Inc, North Chicago, IL, US

Jing Min^a, Drug Discovery Science and Technology, AbbVie Bioresearch Center, Worcester, MA, US. Correspondence: [email protected]

Article: 2044977 | Received 18 Nov 2021, Accepted 17 Feb 2022, Published online: 11 Mar 2022

https://doi.org/10.1080/19420862.2022.2044977

ABSTRACT

N-terminal heterogeneity resulting from non-uniform signal peptide (SP) cleavage can potentially affect the property attributes of biologics and extend product development timelines. Few studies are available on engineering SPs systematically to address miscleavage issues. Herein, we developed a novel high throughput computational pipeline that generates millions of SP mutant sequences and uses the SignalP 5.0 deep learning model to predict which of these mutants are likely to alleviate N-terminal miscleavage in antibodies. We optimized the parameters to target one or two amino acids at the C-terminus of 84 unique SPs, exhausting all theoretically possible combinations and resulting in a library of 296,077 unique wildtype and mutant signal peptides for in silico screening of each antibody. We applied this method to five antibodies against different targets, with varying extents of miscleavage (2.3% to 100%) on their Lambda light chains. In each case, multiple SP mutants were generated, with miscleavage reduced to a non-detectable level and titers comparable to or better than that of the original SPs. Pairwise mutational analysis using an in silico library enriched with high-scoring mutants revealed patterns of amino acids at the C-terminus of SPs, providing insights beyond the “Heijne rule”. To our knowledge, no similar approach that combines high throughput in silico mutagenesis and screening with SP cleavage prediction has been reported in the literature. This method can be applied to both the light chain and heavy chain of antibodies, regardless of their initial extent of miscleavage; it provides optimized solutions for individual cases and facilitates the development of antibody therapeutics.

Abbreviations: Aa, amino acids; CHO, Chinese hamster ovary; CNN, convolutional neural network; CSscore, cleavage site score; CSV, comma-separated values; HC, heavy chain; HEK, human embryonic kidney; HPLC, high-performance liquid chromatography; IgG, immunoglobulin G; IGLV, immunoglobulin G Lambda variable; LC, light chain; LCMS, liquid chromatography–mass spectrometry; MS, mass spectrometry; PCR, polymerase chain reaction; PBS, phosphate-buffered saline; PEI, polyethylenimine; SP, signal peptide; SPase, signal peptidase; TCEP, tris(2-carboxyethyl) phosphine; TOF, time-of-flight.

KEYWORDS:

Signal peptides
N-terminal miscleavage
antibodies
deep learning
SignalP 5.0
in silico screening
high throughput mutagenesis

Introduction

The majority of secretory proteins in bacteria, Archaea, and Eukarya, as well as some transmembrane and intracellular proteins, carry a short peptide averaging 16–30 amino acids in length at the N-terminus.^{Citation1–5} This peptide, called the signal peptide (SP), serves as the address label marking the translocation and secretion pathways of premature proteins. In mature proteins, SPs are cleaved by one of the three types of signal peptidase (SPase) – type I, II, and IV.^Citation6 SPases bind primarily to the last three to seven residues located at the C-terminus of the SP (also known as the C-region). This region, along with the N-terminus of the mature protein, plays an important role in determining the cleavage sites.^Citation6

Due to their role in protein translocation and secretion, SPs have been widely used in the production of recombinant proteins. Efforts to engineer SPs are primarily focused on enhancing protein expression levels. Because SP cleavage is the rate-limiting step in protein secretion, cleavage efficiency affects production titer.^Citation7 To increase the titer of recombinant proteins in bacteria, yeast, or mammalian cells, a number of studies used methods that screened either small sets of native, heterologous SPs,^Citation8 or combinations of such SPs and promoters.^{Citation9–12} Other studies involved creating random or site-specific mutagenic libraries of SPs via PCR, and then testing the mutants for their ability to improve titer.^{Citation13–16}

A much less investigated aspect of SP engineering is creating mutant SPs to correct the N-terminal miscleavage of proteins. N-terminal miscleavage results in either truncation (amino acids missing from the N-terminus of the mature protein) or extension (overhangs of extra amino acids at the N-terminus of the mature protein), leading to undesired product heterogeneity that potentially affects the function of biologic products, especially antibodies. For example, in their developability assessment of a human IgG1 antibody produced by Chinese hamster ovary (CHO) cells, Gibson et al. discovered that the N-terminal truncation of the antibody Lambda light chain can be addressed by either replacing the original murine-derived SP with a human IGLV1 germline SP, or by mutating the N-terminus SYE motif of the light chain.^Citation17

Experimental analysis of SP cleavage sites through cloning, protein production, and liquid chromatography–mass spectrometry (LC-MS) requires extensive effort. To facilitate this process, in silico tools have been developed to predict SP cleavage sites. Examples include convolutional neural network (CNN) models DeepSig^Citation18 and SigUNet,^Citation19 sequence alignment model Signal-Blast, Bayesian classifier Signal-BNF,^Citation20 and dynamic Bayesian network model Philius.^Citation21 More recently, Signal-3L 3.0 integrated CNN with self-attention and conditional random field to achieve robust performance.^Citation22 SignalP 5.0, a deep neural network model combined with conditional random field, consistently ranks among the top in benchmark studies across various organisms.^{Citation22–24} SignalP 5.0 was trained on 20,785 proteins, and is able to predict cleavage by type I and II SPases in the Sec and Tat secretory pathways (i.e., Sec/SPI, Sec/SPII, Tat/SPI). Optimized transfer learning gave SignalP 5.0 the ability to make predictions for multiple organisms, including Eukarya, Archaea, Gram-positive and Gram-negative bacteria.^Citation23 SignalP 5.0, as well as its previous versions (3.0 and 4.0), has been used in a few studies to aid the screening of wildtype signal peptides for improved secretory production of recombinant proteins and antibody fragments in E. coli,^{Citation25,Citation26} yeast,^Citation27 and Leishmania tarentolae.^Citation28

Despite these advances, very few studies have used in silico tools in a systematic way to engineer SPs that can address N-terminal miscleavage of proteins. In this study, we developed a novel high throughput computational pipeline that is capable of generating millions of SP mutants and uses SignalP 5.0 to screen and predict which of these mutants are likely to alleviate N-terminal miscleavage in antibodies. This pipeline is highly efficient: it creates mutants, predicts the probability of correct SP cleavage, and analyzes and rank-orders the results in batch. We optimized its parameters to target one or two amino acid positions at the C-terminus of each of the 84 unique wildtype SPs, most of which were derived from human germline V-gene alleles, exhausting all theoretically possible combinations and resulting in a library of 296,077 unique wildtype SPs and their mutants for in silico screening of each input protein. We applied this method to five antibodies against different targets with varying extents of N-terminal miscleavage. In each case, we obtained multiple SP mutants that successfully reduced miscleavage to a non-detectable level, while maintaining titers similar to or better than the original. Further mechanistic studies using an in silico library enriched with high-scoring SP mutants revealed patterns of amino acids at the C-terminus of SPs.

Results

Comparison of CS scores of heavy chains, Lambda light chains, and Kappa light chains from an in-house collection of antibodies

To identify antibodies likely to have SP miscleavage issues, we used the computational pipeline to analyze an in-house collection of antibodies (19,151 heavy chains; 11,250 Kappa chains; and 1,169 Lambda chains) with SPs commonly used for antibody production. The Cleavage Site (CS) score produced by the SignalP 5.0 model ranges from 0 to 1 and represents the predicted marginal probability that cleavage happens at a site. The higher the CS score, the higher the marginal probability that cleavage happens at that specific site. The SignalP 5.0 model produces a CS score for each residue in the input sequence, but a high CS score may not always mark the correct cleavage site. For the rest of the study, all mentions of CS score refer to the score at the correct cleavage site. As shown in Supp. , the CS scores of the heavy chains (median 0.97) and Kappa chains (median 0.95) were tightly packed between 0.8 and 1, while the scores of the Lambda light chains (median 0.72) spread from 0.4 to 1. The result suggested that the Lambda light chains were more prone to miscleavage than Kappa or heavy chains.

Figure 1. Miscleavage and titer of SP mutants of mAb 1. A: Sunburst plot showing the fraction of SP mutants with a greater (orange) or lower (turquoise) percentage of miscleavage by MS compared to that of the original SP. These are displayed in the inner circle. Among those with less miscleavage, the percentages (calculated as fractions of all mutants tested) of mutants with non-detectable miscleavage (light green) or detectable but less miscleavage (light blue) are displayed in the outer circle. B: Bubble plot of CS score at the correct cleavage site vs percentage of miscleavage by MS. Color of the dots represents the original SP, single aa mutants, or double aa mutants. Size of the dots represents the number of miscleavage sites. The smallest dots have zero miscleavage products (i.e., 0% miscleaved). The dashed line represents a threshold of the CS score as explained in the text. C: Scatter plot of CS score at the correct cleavage site vs titer. Marker style represents the original SP or a mutant SP with non-detectable, less but detectable, equal, or more miscleavage compared to that of the original SP. The original SP and the best mutant SP (non-detectable miscleavage and the highest titer) are labeled. The dashed line represents half of the titer of the original SP. Green dots above the dashed line are mutants with non-detectable miscleavage and titers comparable to or better than that of the original SP. D: Representative MS analysis of the original SP and the best mutant SP.

Impact of the number of residues at the N-terminus of the mature protein on the CS score

Five antibodies against different targets were selected for this study (). These antibodies have unique Lambda light chains (Supp. ). These Lambda light chains demonstrated varying extents of miscleavage (2.3% to 100% miscleaved) as confirmed by LC-MS. Before conducting in silico mutagenesis of SPs to address the miscleavage issue, we first investigated how many residues from the N-terminus of the Lambda chain to include in the input sequence. The maximum length of the input sequence (SP plus mature protein) allowed by the SignalP 5.0 model is 70 aa. Consequently, for each antibody, we tested a range of lengths of the N-terminus of the mature Lambda chain, from 5 to 51 aa, capping the length of the entire input sequence at 70 aa. The results showed that the number of residues at the N-terminus of the protein affected the CS score, and this effect was most prominent with sequences shorter than 10 residues (Supp. ). Because SPs naturally vary in length, for computational convenience, in the rest of the study we simply used the longest possible length (70 minus the length of the SP) for the mature protein in the input sequence.
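The input construction described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' code: the function names `build_input` and `correct_site_index` are hypothetical, and the only values taken from the text are the 70 aa cap and the convention that the correct cleavage site is the last residue of the SP.

```python
MAX_INPUT_LEN = 70  # maximum SP + mature-protein length stated in the text


def build_input(sp, mature, max_len=MAX_INPUT_LEN):
    """Concatenate an SP with as many mature N-terminal residues as fit.

    Hypothetical helper: uses the longest possible mature segment,
    i.e. max_len minus the length of the SP.
    """
    if len(sp) >= max_len:
        raise ValueError("signal peptide leaves no room for mature residues")
    n_mature = max_len - len(sp)
    return sp + mature[:n_mature]


def correct_site_index(sp):
    """0-based index of the last SP residue in the input sequence.

    This is the residue at which the CS score would be read when
    ranking mutants (position 0 of the SP in the article's numbering).
    """
    return len(sp) - 1
```

With a 19 aa SP, for example, this leaves room for 51 mature residues, which matches the upper end of the range tested above.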

Table 1. Summary of MS cleavage and production titers of the SP mutants tested

Figure 2. Miscleavage and titer of SP mutants of mAb 2. Descriptions for plots A, B, C, and D are the same as those in Figure 1. In plot C, green dots circled with dashed lines are the mutants with non-detectable miscleavage and a greater than 2-fold improvement in titer compared to the original SP.

Figure 3. Miscleavage and titer of SP mutants of mAb 3. Descriptions for plots A, B, C, and D are the same as those in Figure 1. In plot C, green dots circled with dashed lines are the mutants with non-detectable miscleavage and a greater than 2-fold improvement in titer compared to the original SP.

Percentage of mutant SPs in the in silico library with CS scores above 0.8

Previous studies on SP engineering involved manual selection and testing of mutants, which is largely a trial-and-error process.^Citation17 To demonstrate the possible outcomes of randomly generated SP mutants, using a collection of 84 unique wildtype SPs described in Supp. Table 2, we introduced one or two mutations among the last five amino acids at the C-terminus of each SP, exhausting all theoretical combinations. The sequences were then filtered to retain only the unique ones, resulting in a final SP library of 295,993 SP mutants and 84 wildtype SPs. Using the computational pipeline, we screened each of the five Lambda light chains using the same SP library. By integrating the areas under the kernel density plots, which estimate probability distributions, we discovered that, among the 296,077 unique wildtype and mutant SPs generated for each of the five antibodies, only 2% to 27% (mean 11.8%) had CS scores greater than 0.8 (Supp. ). This suggests that most of the randomly designed SP mutants had medium to low CS scores, and therefore might still be prone to miscleavage. Another observation is that the CS score depends not only on the SP sequence, but also on the sequence of the N-terminus of the mature protein. Because the same SP library was used in screening, the only difference is the N-terminal sequence of the antibody light chains. This difference resulted in the various shapes of the kernel density plots, and consequently the different chances of finding high-scoring mutants. Notably, mAb 5 and mAb 2 had a similar N-terminus, where the first 23 amino acids were the same and the sequences differed only after that (Supp. ). Their kernel density curves overlapped with each other, and the probability of finding high-scoring mutants was 2% for both antibodies.
Taken together, the data indicate that computationally ranking the randomly generated SP mutants is necessary, in order to sort out the small populations of the high-scoring mutants. This process needs to be performed for each protein with a unique N-terminus, because a high-scoring SP mutant in one case might not be applicable to another case.
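The exhaustive single and double aa mutagenesis with duplicate removal can be illustrated with a short Python sketch. It is a minimal reconstruction under stated assumptions, not the authors' "sp" package: the names `mutants` and `build_library` are hypothetical, and the sketch assumes substitutions are drawn from the 20 standard amino acids and exclude the original residue.

```python
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
WINDOW = 5  # last five SP residues (positions -4 to 0)


def mutants(sp, max_mutations=2):
    """All sequences differing from `sp` at 1..max_mutations window positions."""
    start = len(sp) - WINDOW
    if start < 0:
        raise ValueError("signal peptide is shorter than the mutation window")
    out = set()
    for k in range(1, max_mutations + 1):
        for idxs in combinations(range(start, len(sp)), k):
            # At each chosen position, allow any residue except the original.
            choices = [[a for a in AMINO_ACIDS if a != sp[i]] for i in idxs]
            for subs in product(*choices):
                seq = list(sp)
                for i, a in zip(idxs, subs):
                    seq[i] = a
                out.add("".join(seq))
    return out


def build_library(wildtypes):
    """Union of wildtypes and their mutants; the set keeps one copy of duplicates."""
    library = set(wildtypes)
    for sp in wildtypes:
        library |= mutants(sp)
    return library
```

Under these assumptions each SP yields 5 × 19 = 95 single and 10 × 19² = 3,610 double aa mutants, so 84 SPs give at most 84 × 3,705 = 311,220 mutants before deduplication, consistent with the 295,993 unique mutants reported above.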

Figure 4. Miscleavage and titer of SP mutants of mAb 4. Descriptions for plots A, B, C, and D are the same as those in Figure 1.

Impact of the number of mutations on the CS score

We investigated the effect of the number of mutations (0, 1 or 2) on the predicted CS scores of SP mutants. In each of the five antibodies, the median CS score consistently decreased as the number of mutations increased (Supp. ), suggesting that, in general, random mutations in the −4 to 0 region reduce the predicted cleavage specificity of the SPs. For each antibody, we ranked all 296,077 wildtype and mutant SPs in descending order of their CS scores at the correct cleavage site. No wildtype SP was found among the top 100 sequences for any of the five antibodies we tested. Interestingly, among the top 100 (0.03%) mutants, at least 95% were double amino acid (double aa) mutants, and the rest were single amino acid (single aa) mutants (Supp. ). This indicated that while double aa mutagenesis resulted in a lower CS score overall, it also produced the majority of the top-scoring mutants.
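The two summaries in this section (median CS score per number of mutations, and the composition of the top-ranked sequences) could be computed along the following lines. This is a hedged sketch: the record layout `(sp_id, n_mutations, cs_score)` and the function name `summarize` are assumptions made for illustration, not part of the published pipeline.

```python
from collections import Counter
from statistics import median


def summarize(records, top_n=100):
    """Summarize (sp_id, n_mutations, cs_score) records.

    Returns the median CS score for each mutation count and a Counter
    of mutation counts among the top_n highest-scoring records.
    """
    records = list(records)
    scores_by_n = {}
    for _sp_id, n_mut, score in records:
        scores_by_n.setdefault(n_mut, []).append(score)
    medians = {n: median(s) for n, s in scores_by_n.items()}

    # Rank in descending order of CS score at the correct cleavage site.
    ranked = sorted(records, key=lambda r: r[2], reverse=True)
    top_counts = Counter(n_mut for _sp_id, n_mut, _score in ranked[:top_n])
    return medians, top_counts
```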

Figure 5. Miscleavage and titer of SP mutants of mAb 5. Descriptions for plots A, B, C, and D are the same as those in Figure 1.

Improvement of the CS score with single and double aa mutations

Two strategies are commonly used in mutagenesis. The first strategy involves conducting mutagenesis step by step. For example, wildtype SPs are screened first, then only the best wildtype SPs are moved forward as templates for creating single aa mutants. Subsequently, only the best single aa mutants are moved forward for the next round of mutagenesis. In the second strategy, all wildtype SPs and their theoretically possible single and double aa mutants are generated and screened. Here, we demonstrated that the second approach is preferred for in silico SP screening. In Supp. , four wildtype SPs were selected: sp_10, sp_15, VL3_3l, and VL10_10a. CS scores from each wildtype SP, as well as its highest-scoring single aa mutant and its highest-scoring double aa mutant, were plotted. SP sp_15 had the highest score among the selected wildtype SPs. SP sp_10 had a lower score to start with, but its best double aa mutant was on par with that from sp_15. In addition, VL3_3l and VL10_10a had extremely low scores (< 0.06) as wildtypes. However, their best double aa mutants reached scores of 0.71–0.98, a marked improvement over their wildtypes. The results suggest that it is possible to convert a low-scoring SP to a high-scoring one with one to two mutations. As a result, even though covering all theoretically possible mutants demands more computational time, it is still desirable because it avoids missing potentially high-scoring mutants that a stepwise screen would discard along with their low-scoring parents.

Figure 6. Overview of the computational pipeline. A: The Python package “sp” was developed and integrated with the SignalP 5.0 executable to build the complete pipeline from mutant generation to prediction and analysis. B: Numbering scheme used in the “sp” package. The region from position −4 to 0 was used for creating single and double aa SP mutants. C: Illustration of the concept of duplicated mutants generated from the mutagenesis of unique wildtype SPs. Only one of the duplicated mutants was retained for analysis, so that the final output of 296,077 wildtypes and mutants contained only unique SP sequences.

Experimental validation of the computational pipeline

The five antibodies were expressed and characterized. Three of them (mAb 2, mAb 4, and mAb 5) were initially expressed using wildtype SP sp_13. mAb 3 was initially expressed using wildtype SP sp_12 and mAb 1 using wildtype SP VL3_3r. The absolute value of the CS scores for these wildtype SPs appeared to have little correlation with the percentage of miscleavage observed by MS. For example, mAb 1 and mAb 5 had similar scores (0.479 vs 0.414). However, mAb 1 was nearly 100% miscleaved, while mAb 5 was only 2.3% miscleaved. In addition, mAb 3 and mAb 2 had very different scores (0.941 vs 0.415), but their percentages of miscleavage were similar (5.5% vs 7.4%) (). This can be explained by the fact that the SignalP 5.0 model was trained as a multi-label classifier (i.e., probability of cleavage at a specific site) rather than a regressor based on MS cleavage data. As a result, the absolute value of the CS score is not predictive of the actual percentage of miscleavage by MS.

Because of this limitation, we adopted a method of relative comparison. Our hypothesis is that, for a given antibody, regardless of the absolute value of the CS score of the original SP, if we can generate an SP mutant with a CS score higher than that of the original SP (i.e., a marginal probability of cleavage closer to 1), then the SP mutant will be more likely to be cleaved at the correct location, and consequently miscleavage events will be less likely to happen elsewhere. To verify this hypothesis, we ranked the mutants in descending order according to their CS scores. For each antibody, we selected some SP mutants from the top of this list (scores > 0.9), and then randomly picked SP mutants with scores in the 0.5–0.9 range. We expressed and characterized the five antibodies using a total of 82 SP mutants (14 for mAb 1, 19 for mAb 2, 21 for mAb 3, 14 for mAb 4, and 14 for mAb 5). All these mutants but one were double aa mutants.

The results are summarized in , and raw data provided in Supp. . For each antibody, we obtained multiple SP mutants (42% to 93% of the mutants tested) that successfully addressed the miscleavage issue, with no detectable miscleavage peaks on MS. Detailed analyses of the mutants screened for mAb 1, mAb 2, mAb 3, mAb 4, and mAb 5 are displayed in , respectively. Specifically, in all mutants screened for mAb 1 and mAb 4, the percentage of miscleavage was reduced compared to that of the original SP (). On the other hand, among the mutants screened for the other three antibodies, there was a mixture of reduced and increased miscleavage events ( and ).

In general, it was difficult to establish a universal threshold of the CS score, above which all clones showed non-detectable miscleavage. In some cases, a threshold could be established for one antibody, but it had to be modified for another antibody. For example, a cutoff score could be established at approximately 0.65 for mAb 1 (), but needed to be raised to 0.82 for mAb 5 () and 0.94 for mAb 4 (). On the other hand, it is difficult to establish such a threshold for mAb 2 and mAb 3 because there were some SP mutants that scored very high, but still showed various degrees of miscleavage (). Regardless, a trend was consistent across all five antibodies: it was more likely to find SP mutants with non-detectable miscleavage among the highest-scoring clones. For example, in this study, we tested mutants with CS scores ranging from 0.5 to 0.996. If we had only tested mutants with CS scores of at least 0.9, the success rate of identifying mutants with non-detectable miscleavage and titers comparable to or better than that of original SP would be much higher (, last 2 columns). As a result, our recommended approach is to select a panel of mutants with the top-ranked CS scores (e.g., the top 10 mutants) for production and testing. While this approach does not guarantee that all the selected mutants will have non-detectable miscleavage, it is likely that we can find at least a few mutants that have mitigated the miscleavage issue, as demonstrated by all five antibodies in this study.
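The effect of restricting testing to high-scoring mutants can be expressed as a simple success-rate calculation. The sketch below is illustrative only: the function name `success_rate` and the record layout are hypothetical, non-detectable miscleavage is encoded as 0%, and a "comparable" titer is taken to mean at least half the titer of the original SP, following the dashed line described for the titer plots.

```python
def success_rate(results, original_titer, min_cs=0.0):
    """Fraction of tested mutants that count as successes.

    results: list of (cs_score, miscleavage_percent, titer) tuples.
    A success has no detectable miscleavage (encoded here as 0) and a
    titer of at least half that of the original SP (assumed cutoff).
    Only mutants with cs_score >= min_cs are considered.
    """
    selected = [r for r in results if r[0] >= min_cs]
    if not selected:
        return None  # no mutants at or above the cutoff
    hits = [r for r in selected
            if r[1] == 0 and r[2] >= 0.5 * original_titer]
    return len(hits) / len(selected)
```

Comparing `success_rate(results, titer)` with `success_rate(results, titer, min_cs=0.9)` for one antibody gives the kind of contrast discussed above between testing the full 0.5 to 0.996 range and testing only mutants scoring at least 0.9.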

Previous studies showed that changes in the SP sequence could potentially affect expression titer.^{Citation8–10} To investigate the effect of SP mutations on titer, we plotted the CS score versus titer for each clone. Little correlation was found between these two variables ( – 5C), which can be explained by the fact that the SignalP 5.0 model was not trained on titer, therefore the CS score does not provide information on titer. Nevertheless, for each antibody, we discovered multiple SP mutants that not only reduced miscleavage to a non-detectable level, but also maintained titers from transient production comparable (within twofold) to that of the original SP. These clones were represented by green circles above the dash lines in – 5C and summarized in . Two mutants for mAb 2 antibody (, circled in dash) and four mutants for mAb 3 antibody (, circled in dash) not only successfully addressed the miscleavage issue, but also improved the titers by more than twofold. Representative MS analysis for the original SP and the best mutant SP (highest titer and non-detectable miscleavage) were displayed for each antibody ( – 5D). These results clearly demonstrated the success of our method, which may potentially offer a universal mitigation platform.

To aid the selection of mutants for future studies, we analyzed titers of all 87 antibodies (including those with the original SPs). The mutants were grouped by the wildtype SPs from which they were derived. The fold change in titer was calculated by dividing the titer of a given antibody produced with a mutant SP, by the titer of the same antibody produced with its original SP. In this way, the difference among the antibodies was normalized. As shown in Supp. Figure 7, some SPs and their mutants appeared to generate lower titers on average (e.g., sp_12, VKII_A18), while some appeared to generate higher titers on average (e.g., sp_14, VH_3-53). Because the number of antibodies produced per SP is relatively small, additional studies are needed to elucidate how SP mutagenesis affects the titer of antibodies.
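The normalization described above (mutant titer divided by the titer of the same antibody with its original SP, then grouped by parent wildtype SP) can be sketched as follows. The function name `mean_fold_change` and the tuple layout are assumptions for illustration; the identifiers in the usage note are placeholders, not data from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_fold_change(titers, original_titers):
    """Mean titer fold change per parent wildtype SP.

    titers: list of (antibody, wildtype_sp, mutant_id, titer) tuples.
    original_titers: dict mapping antibody -> titer with its original SP.
    """
    grouped = defaultdict(list)
    for antibody, wildtype_sp, _mutant_id, titer in titers:
        # Normalize by the same antibody's original-SP titer so that
        # antibodies with different baseline expression are comparable.
        grouped[wildtype_sp].append(titer / original_titers[antibody])
    return {sp: mean(values) for sp, values in grouped.items()}
```

For example, two mutants derived from a placeholder parent `sp_x` with fold changes of 2.0 and 0.5 would average to 1.25 for that parent.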

Pairwise mutational analysis of amino acid patterns at the C-terminus of SPs

The −4 to 0 region () of the SP has been reported to contain the binding sites of SPases^Citation6 and therefore plays an important role in determining the cleavage specificity. Previous efforts studying the pattern of amino acids in this region resulted in the “Heijne rule”, which states that position 0 (i.e., the last amino acid) and position −2 (i.e., the third amino acid from the last) favor non-charged amino acids.^{Citation29–32} This rule was derived from statistical analysis of a small number of functional SPs. In this study, we took a different approach to analyze the preferred amino acids in this region. We asked whether the top-scoring mutants in the double aa mutagenesis library preferred mutations at certain positions, and if so, what these mutations were. To answer this question, we first conducted pairwise mutational analysis by calculating the percentage of mutants at each of the 10 possible pairs of locations for double aa mutations. The 10 location pairs were equally distributed among the 288,900 unique double aa SP mutants (Supp. Figure 8). However, when narrowing down to the top-scoring 100 (0.035%) double aa mutants, a preference was observed for mutation combinations primarily at positions (−1, 0), (−2, −1), and (−4, 0). For mAb 4, a strong preference was also observed for mutations at the (−4, −2) positions.
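The location-pair analysis can be sketched in Python as below. This is an assumption-laden illustration rather than the authors' implementation: `mutated_positions` and `pair_fractions` are hypothetical names, and positions follow the numbering in the text, where 0 is the last SP residue and −4 the fifth from last.

```python
from collections import Counter
from itertools import combinations

# The 10 possible location pairs within positions -4..0.
LOCATION_PAIRS = list(combinations(range(-4, 1), 2))


def mutated_positions(wildtype, mutant):
    """Positions (0 = last SP residue) at which mutant differs from wildtype."""
    if len(wildtype) != len(mutant):
        raise ValueError("sequences must have equal length")
    offset = len(wildtype) - 1
    return tuple(i - offset for i in range(len(wildtype))
                 if wildtype[i] != mutant[i])


def pair_fractions(pairs):
    """Fraction of double aa mutants falling on each location pair."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {p: counts[p] / total for p in LOCATION_PAIRS}
```

Applying `pair_fractions` to the full double aa library and again to its top-scoring subset would give the two distributions compared in this section.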

Next, we analyzed the distribution of the 20 amino acids at the (−1, 0), (−2, −1), (−4, 0), and (−4, −2) locations, among all double aa mutants and among the top-scoring 800 (0.28%) double aa mutants. The top 800 double aa mutants were used for this analysis in order to cover the theoretical diversity twofold: in the pairwise analysis, each position could have one of the 20 amino acids, so a total of 400 pairs of amino acids were possible. The results showed that, among all double aa mutants generated in silico, the distribution of amino acids was random. However, when narrowing down to the top 800 double aa mutants, preferences for certain amino acids were observed (Supp. Figure 9A through 9D). Specifically, only non-charged (i.e., aliphatic or polar) amino acids such as AGSPQ were enriched among the top 800 mutants at position 0. In addition, only non-charged amino acids such as ACGISTV were enriched at position −2. These observations were consistent with the “Heijne rule”. On the other hand, for position −1 and position −4, both of which were less studied in the literature, we observed enrichment of amino acids with different physicochemical properties, e.g., non-charged, negatively charged, positively charged, aromatic. These data indicated that positions 0 and −2 preferred non-charged amino acids, while positions −1 and −4 were more promiscuous.
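The amino acid pair distribution at a given location pair can be tabulated as in the following sketch. The function name `aa_pair_fractions` is hypothetical, and the sketch assumes each mutant is supplied as a plain SP sequence string in which position 0 is the final character.

```python
from collections import Counter

N_POSSIBLE_PAIRS = 20 * 20  # 400 amino acid pairs per location pair


def aa_pair_fractions(mutants, pos_a, pos_b):
    """Fraction of mutants carrying each residue pair at (pos_a, pos_b).

    Positions use the article's numbering: 0 is the last SP residue and
    -4 the fifth from last, so position p maps to string index p - 1.
    """
    counts = Counter((m[pos_a - 1], m[pos_b - 1]) for m in mutants)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}
```

Computing these fractions for all double aa mutants and for the top 800 at the same location pair would yield the two heatmaps that are compared to identify enriched pairs.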

There were two notable phenomena. First, given a specific location pair, the enrichment trends were generally consistent across different antibodies, though the actual percentage of a specific amino acid pair might vary slightly. For example, in Supp. Figure 9A, similar sets of hotspots (A/G/S at position 0 paired with different amino acids at position −1) were found across different antibodies, though there were variations in the intensity of these hotspots. For mAb 2 and mAb 5, in addition to these hotspots, low levels of enrichment were also observed for P/Q at position 0 paired with different amino acids at position −1. Second, given a specific antibody, similar conclusions could be drawn about the enrichment at a specific location across different location pairs. For example, for mAb 1, Supp. Figure 9A (−1, 0) and Supp. Figure 9C (−4, 0) evaluated different location pairs, but revealed a similar pattern of enrichment at position 0 (preference for AGS). Taken together, the results suggested that the enrichment patterns hold regardless of the location pairs or antibodies analyzed.

Discussion

Miscleavage of SPs of recombinant monoclonal antibodies is not uncommon, and it may contribute to product heterogeneity and complicate the development process. Mutagenesis of SPs can mitigate this risk, though it is a trial-and-error process. In this study, we demonstrated the utility of a novel computational pipeline integrated with the SignalP 5.0 deep learning model for high-throughput generation and screening of SP mutants to address the N-terminal miscleavage issue in antibodies. Five antibodies with various degrees of miscleavage (2.3% to 100% miscleaved) were selected for validating this approach. An in silico library of 296,077 unique wildtype and SP mutants was generated from single and double aa mutagenesis at the C-terminus of each of the 84 unique wildtype SPs, exhausting all theoretically possible combinations. For each antibody, we were able to identify multiple SP mutants that successfully reduced the miscleavage to a non-detectable level, while maintaining similar or better titers.

Lambda antibodies were chosen as the focus of this study for several reasons. First, Lambda light chains appear to be more prone to miscleavage than Kappa or heavy chains. Second, these antibodies had significant variation in the levels of miscleavage as determined by MS (2.3% to 100%). Success in each case shows that this method can be applied to antibodies regardless of their initial extent of miscleavage. Finally, these antibodies have different mature N-terminal sequences and target specificities, which demonstrates that the method is applicable to different antibodies. We have also applied it to solving the miscleavage of a heavy chain (data not shown).

There are several considerations when using this computational pipeline. First, full coverage of all theoretically possible mutants is important to avoid loss of diversity. Second, while the pipeline allows up to five mutations within the −4 to 0 region, in practice we recommend mutating one or two amino acids per SP in the first trial. This is because, as the number of mutations per SP increases, the overall CS score decreases, reflecting a surge of low-scoring mutants. At least in this study, we were able to obtain single and double aa mutants with undetectable miscleavage and good titers. Third, even though the SignalP 5.0 model outputs a CS score for each amino acid in the input sequence, only the CS score at the correct cleavage site (i.e., position 0 of the SP) is used for ranking the mutants. Ranking the mutants by CS scores other than that from the correct cleavage site can result in prioritizing mutants that have reduced miscleavage at one site, but increased miscleavage at other sites. Finally, because there is a greater chance of finding successful SP mutants among the highest-scoring ones, our recommended approach is to rank the SP mutants by their CS scores in descending order and select a panel of mutants from the top of the list for in vitro production and evaluation. As shown in , by prioritizing mutants with higher CS scores, mutants with non-detectable miscleavage and good titers are identified at higher rates. While this approach does not guarantee that all selected mutants will have non-detectable miscleavage products, it is likely that some of them will mitigate the miscleavage issue with minimal impact on expression.
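The third and fourth considerations can be sketched as follows. This is an illustrative example rather than the ranking code of the "sp" package. The record layout (`name`, `sp`, `cs_scores`) is hypothetical, and the sketch assumes that the per-residue scores are indexed so that the entry for the last SP residue holds the score for the correct cleavage site.

```python
def rank_by_correct_site(predictions, top_n=10):
    """Rank SP variants by the CS score at position 0 of the SP only.

    Each prediction is a dict with hypothetical keys:
      'name'      - label of the wildtype or mutant SP
      'sp'        - the SP sequence
      'cs_scores' - one CS score per residue of SP + mature protein
    """
    ranked = []
    for p in predictions:
        site = len(p["sp"]) - 1  # index of position 0 (last SP residue)
        ranked.append((p["cs_scores"][site], p["name"]))
    # Highest score at the correct site first.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked[:top_n]


# Toy data: mutant_a scores highly at a wrong site, mutant_b at the correct one.
toy = [
    {"name": "mutant_a", "sp": "MKLSA", "cs_scores": [0, 0, 0.90, 0, 0.05, 0, 0, 0]},
    {"name": "mutant_b", "sp": "MKLSA", "cs_scores": [0, 0, 0.02, 0, 0.85, 0, 0, 0]},
]
print(rank_by_correct_site(toy))
```

Ranking by the maximum score anywhere in the sequence would have placed `mutant_a` first, even though its predicted cleavage is concentrated at the wrong site.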

Two aspects of our study are novel. First, we used high throughput in silico mutagenesis and screening to optimize SPs for their cleavage specificity. Very few studies on engineering SPs to address N-terminal miscleavage issues are available in the literature. Previous efforts primarily focused on mutating or truncating the N-terminus of the mature protein, manually mutating one aa at a time in the SP, or selecting a different wildtype SP. These methods have drawbacks. For example, mutations or truncations at the N-terminus of the mature protein might interfere with its function or increase its immunogenicity risk. In addition, functional studies need to be repeated for the mutant protein because its sequence has changed, which delays the development timeline and requires additional resources. On the other hand, manually designing SP mutations has a relatively low chance of success. Studies have shown that the N-terminus of the mature protein is involved in SPase recognition [6]. This implies that the optimal SP needs to be determined case-by-case in the context of the specific mature protein sequence, so there might not be a single solution that fits all. Our computational pipeline can address these concerns by performing large-scale, high throughput in silico screening of SP mutants. It can be applied to different proteins, providing an optimized solution in each case, and substantially saving time and resources. Second, our study is the first to conduct pairwise mutational analysis on the pattern of amino acids in the −4 to 0 region of the SP. Previous studies from Heijne et al. used small datasets of 65 to 188 functional SPs to perform statistical analysis on the preferred amino acids at single locations [30–32]. Our approach involves a much larger dataset, consisting of the top-scoring 800 (0.28%) mutants enriched from an initial library of 288,900 double aa SP mutants. It improves our understanding of the amino acid composition at the C-terminus of SPs.

There are several caveats of this method. First, as discussed earlier, the absolute CS score correlates poorly with the actual percentage of cleavage based on MS analysis data. As a result, inference about the percentage of miscleavage cannot be reliably made based on CS score alone. Second, not all the highest-scoring SP mutants will have reduced or non-detectable miscleavage. Moreover, because SignalP 5.0 does not have the ability to predict titer, some of the mutants might have low titers. Third, we observed that the length of the N-terminus of the mature protein affects the CS score, and this effect is most prominent with sequences shorter than 10 residues. While the degree of fluctuation differed depending on the exact N-terminal sequence of the protein, it is advisable to use longer sequences for prediction. In our study, we took the first 47 to 51 amino acids on the N-terminus of the light chains for prediction, thus capping the length of the entire input sequence (SP plus mature protein) at 70 amino acids, the maximum allowed by SignalP 5.0.
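The length cap described above amounts to simple arithmetic: the longer the SP, the fewer mature-protein residues fit in the input. A minimal sketch follows, with a hypothetical helper name `build_input` and placeholder sequences; the 70-residue limit is the figure stated in the text.

```python
MAX_INPUT_LENGTH = 70  # input limit for SignalP 5.0 as stated in the text


def build_input(sp, mature_chain, max_length=MAX_INPUT_LENGTH):
    """Concatenate an SP with as many mature-protein residues as the limit allows."""
    n_mature = max_length - len(sp)
    if n_mature <= 0:
        raise ValueError("SP is too long to leave room for the mature protein")
    return sp + mature_chain[:n_mature]


# Placeholder sequences (not real SPs or light chains), used only to show the lengths.
mature = "Q" * 110
for sp_length in (19, 23):
    query = build_input("M" * sp_length, mature)
    print(sp_length, len(query) - sp_length, len(query))
```

With SPs of 19 to 23 residues, this leaves 51 to 47 residues of the mature light chain, which matches the range reported in the text.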

In summary, we demonstrated the success of this computational pipeline by screening an in silico library of wildtype and mutant SPs for each of the five antibodies, successfully solving their N-terminal miscleavage issues while maintaining similar or better titers. This method can be applied to different antibodies regardless of their initial extent of miscleavage, provides optimized solutions for individual cases, and facilitates the development of antibody therapeutics.

Materials and methods

Development of a computational pipeline for mutagenesis, prediction, and ranking

The SignalP 5.0 standalone executable was licensed from the Technical University of Denmark. The executable has the same functions as the web version of SignalP 5.0 [33], but allows command line access. To introduce functions such as mutagenesis, text file parsing, ranking and interactive plotting, none of which is available in the executable, we developed a Python package called “sp” (source code available in Supplementary Materials), and integrated it with the executable to build the complete computational pipeline. As shown in , the pipeline takes input from a collection of wildtype SPs and the N-terminal sequence of a mature protein. For each of the wildtype SPs, it produces a list of all theoretically possible single aa and double aa mutants, each harboring unique mutation(s) in the −4 to 0 region at the C-terminus of the SP (). Notably, previous studies by Heijne et al. [30] used a numbering system consisting of negative and positive integers, but no zero. In our method, we assigned zero to the correct cleavage site, for the convenience of computation, because most algorithms embed sequences using consecutive integers. The wildtype and mutant SPs are then combined with the N-terminal sequence of the mature protein and fed into the executable for prediction. Next, the pipeline parses the resulting text files generated by the executable, and removes duplicated mutants derived from different wildtype SPs (concept illustrated in ). Finally, it outputs a CSV file of all SP mutants ranked in descending order of the CS score at the correct cleavage site, a number produced by the SignalP 5.0 executable. The CS score ranges from 0 to 1 and represents the marginal probability that a cleavage happens at the indicated site. The pipeline also generates interactive plots of sequence versus CS score for any wildtype or mutant SP that the user selects. All computations were done in a Linux environment on a high-performance computing cloud instance on Amazon Web Services.
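The mutagenesis step can be sketched in a few lines. This is an illustration under stated assumptions, not the code of the "sp" package: the function name `mutate_c_terminus` is hypothetical, the example SP is a placeholder rather than one of the 84 SPs in the study, and substitutions are drawn from the 20 standard amino acids.

```python
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutate_c_terminus(sp, window=5, max_mutations=2):
    """Return all mutants of `sp` with 1..max_mutations substitutions
    confined to its last `window` residues (positions -4 to 0)."""
    start = len(sp) - window
    mutants = set()
    for k in range(1, max_mutations + 1):
        # Choose which k positions in the window are mutated.
        for sites in combinations(range(start, len(sp)), k):
            # At each chosen site, allow the 19 residues that differ from wildtype.
            choices = [[aa for aa in AMINO_ACIDS if aa != sp[i]] for i in sites]
            for replacement in product(*choices):
                seq = list(sp)
                for i, aa in zip(sites, replacement):
                    seq[i] = aa
                mutants.add("".join(seq))
    return mutants


example_sp = "MAWALLLLTLLTQGTGSWA"  # placeholder SP, 19 residues
singles = mutate_c_terminus(example_sp, max_mutations=1)
all_mutants = mutate_c_terminus(example_sp, max_mutations=2)
print(len(singles), len(all_mutants) - len(singles))
```

For any SP of at least five residues, this yields 5 × 19 = 95 single mutants and 10 × 19² = 3,610 double mutants, the per-SP counts reported in the next section.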

Input antibodies and wildtype SPs

We selected five antibodies against different targets from patents (mAb 1 [34], mAb 2 [35], mAb 3 [36], mAb 4 [37], and mAb 5 [38]). These antibodies have light-chain sequences belonging to the Lambda family. Due to the length limit of the input sequence (maximum 70 amino acids including the SP) imposed by the SignalP 5.0 model, only the first 47 to 51 amino acids on the N-terminus of the light chains were used for prediction. Alignment of the first 52 amino acids at the N-terminus of the light chains showed that they were all unique (Supp. ). The light chains exhibited different extents of N-terminal miscleavage, ranging from 2.3% to 100% miscleaved, as confirmed by MS analysis. A collection of 84 unique wildtype SPs, ranging from 19 to 23 amino acids in length, was acquired from public databases [39–42] (Supp. Table 2). The majority of these SPs come from human germline V-gene alleles. For each antibody, 311,304 SP sequences (that is, 1 wildtype, 95 single aa mutants and 3,610 double aa mutants for each of the 84 wildtype SPs) were generated for prediction. After removal of duplicated mutants, 296,077 unique wildtype and mutant SPs were retained and analyzed ().
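The library size follows from counting substitutions in a five-residue window, and the deduplication step can be expressed as grouping identical sequences. The sketch below is illustrative only: the function names and the toy sequences are hypothetical, and the way the "sp" package actually tracks duplicates is not described in the text.

```python
from math import comb


def library_size_per_sp(window=5, alphabet=20, max_mutations=2):
    """Count the wildtype plus all mutants with up to `max_mutations` substitutions."""
    return sum(
        comb(window, k) * (alphabet - 1) ** k for k in range(max_mutations + 1)
    )


def deduplicate(sequences_by_parent):
    """Map each unique sequence to the wildtype SPs that generated it."""
    parents = {}
    for parent, sequences in sequences_by_parent.items():
        for seq in sequences:
            parents.setdefault(seq, []).append(parent)
    return parents


per_sp = library_size_per_sp()  # 1 + 95 + 3,610
print(per_sp, 84 * per_sp)

# Toy example: two parents that differ at one window position share a mutant.
toy = {"SP_a": ["LLAAS", "LLAGS"], "SP_b": ["LLAGS", "LLTGS"]}
unique = deduplicate(toy)
print(len(unique), unique["LLAGS"])
```

The count per SP is 3,706, so 84 SPs give 311,304 sequences before deduplication. Duplicates arise because closely related wildtype SPs can reach the same sequence through different substitutions.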

Cloning of SP mutants

SP mutants were generated by routine gene synthesis of DNA encoding the mutated SP with flanking restriction enzyme sites. The DNA was then cloned by standard methods into the original plasmid, thereby substituting the original SP sequence with the mutant sequence. Transfection grade plasmid DNA was generated by standard amplification and purification methods.

Expression of SP mutants

Plasmids encoding SP mutants were transfected into CHO-3E7 cells (NRC-Canada) at 50 ml scale using 125 ml Erlenmeyer flasks (Corning 431143 or HTSLabs 931110) shaking at 125 rpm in a 50 mm throw shaker. The cells were grown at 35°C with 8% CO₂ in HyCell TransFx-C expression media (Cytiva SH30941.02) supplemented with 8 mM Glutamax (ThermoFisher 35050–061), 0.18% Pluronic F68 (ThermoFisher 24040–032), and 5 ml/L Penicillin-Streptomycin (ThermoFisher 15070–063). Cells were grown to a density of 2.5 × 10⁶ cells/ml and transfected by mixing plasmid DNA (0.5 mg/L of culture with HC:LC ratio of 2:3) with 1 mg/ml PEI Max (PolySciences 24765–1) in a 1:6 ratio of DNA:PEI. DNA and PEI were mixed in 5% of the culture volume of TransFx-C media for 10 minutes before being added to the culture. Cells were fed with 10% BalanCD CHO Feed 4 (Irvine Scientific 94134) 24 hours post transfection, and again at 72 hours post transfection. In addition, the glucose level was adjusted to 10 g/L at 72 hours post transfection using 45% D-(+)-glucose solution (Sigma G8769). Expression was carried out at 35°C for the first 24 hours, and the temperature was then lowered to 32°C at the first feed. Cultures were shaken at 125 rpm and incubated with 8% CO₂ for 10 days post transfection. The media containing the secreted antibody was separated from the cells by centrifugation at 3,000 × g for 10 minutes, and then filtered through a 0.45 μm Stericup (Millipore SCHVU01RE) or, for analytical samples, through a 0.45 μm Claristep filter (Sartorius 17C06TF – 96).

Analysis of expression titer and purification of analytical sample

The expression titer of the secreted antibody was determined by analytical HPLC using a Poros G 20 μm column (Thermo Fisher 1512224) on a Waters Acquity Arc HPLC system with a fraction collector. Briefly, the column was equilibrated with phosphate-buffered saline (PBS) and the samples were loaded onto the column at a flow rate of 2.0 ml/minute. The column was washed with PBS prior to elution with elution buffer consisting of 150 mM NaCl and 0.016 N hydrochloric acid. The eluted antibody was collected in a 96-deepwell plate (Waters WAT058957) using the fraction collector and neutralized with 0.5 M Na₂PO₄. This sample was then used to perform MS. All production and analysis experiments were repeated at least twice.

Analysis of cleavage percentage by MS

The molecular weight of the purified antibodies was determined by reduced MS. Briefly, the sample was reduced by incubation for 15 minutes in the presence of 50 mM TCEP (Thermo Fisher 77720). The MS/MS spectra were captured on an Agilent 6230 TOF LC/MS instrument, and the result was deconvoluted using Mass Hunter (Agilent) or Byos (Protein Metrics) software to determine the molecular weight of the individual antibody chains. The relative % cleavage of chains that were misprocessed was calculated by measuring the peak height of the misprocessed chain as a percent of the peak height of the intact chain.

Data analysis

Data analysis was done in Python. Kernel density estimation was conducted using a Gaussian kernel with a bandwidth calculated by the Scott method [43]. Plots were generated using the Matplotlib (version 3.4.3) and Plotly (version 4.14.3) packages. Sequence alignments were produced in Geneious Prime (version 2021) with the Geneious Alignment method.
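For readers unfamiliar with the density estimate mentioned above, here is a minimal one-dimensional sketch in pure Python. It is not the analysis code of the study, which does not state which library was used. The function names and sample values are hypothetical, and the bandwidth uses the convention that SciPy's `scipy.stats.gaussian_kde` applies for `bw_method="scott"`, namely the sample standard deviation multiplied by n^(−1/5) in one dimension.

```python
import math
from statistics import stdev


def scott_bandwidth(samples):
    """Scott's rule for 1-D data: sample standard deviation times n^(-1/5)."""
    return stdev(samples) * len(samples) ** (-1 / 5)


def gaussian_kde_at(samples, x, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` at point x."""
    h = scott_bandwidth(samples) if bandwidth is None else bandwidth
    norm = len(samples) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


# Hypothetical CS scores, used only to exercise the functions.
scores = [0.62, 0.71, 0.75, 0.80, 0.83, 0.91]
print(scott_bandwidth(scores))
print(gaussian_kde_at(scores, 0.8))
```

In practice a library routine such as `scipy.stats.gaussian_kde` would normally be used in place of this hand-written version.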

Acknowledgments

The authors would like to thank the following AbbVie colleagues: Laura Miesbauer and Mya Peterson (former AbbVie employee) for MS analysis support; Barbara Robertson, Lianchun Fan, and Ramesh Iyer for providing the SP sequences; Eric Hebert for providing antibody database for analysis; Diana Bowley and Christopher Lobo for technical support; Lili Huang, Andrew Goodearl, and Romit Majumdar for reviewing this manuscript and providing constructive feedback.

Data availability statement

Sequences of the input SPs, source code of the “sp” package, and detailed data on each clone are provided in the Supplementary Materials.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed on the publisher’s website

Additional information

Funding

This work was fully funded by AbbVie Inc.

References

1. Nouwen N, Berrelkamp G, Driessen AJM. Bacterial sec-translocase unfolds and translocates a class of folded protein domains. J Mol Biol. 2007;372(2):422–13. doi:10.1016/j.jmb.2007.07.003.
2. Rapoport TA. Protein translocation across the eukaryotic endoplasmic reticulum and bacterial plasma membranes. Nature. 2007;450(7170):663–69. doi:10.1038/nature06384.
3. Pohlschröder M, Giménez MI, Jarrell KF. Protein transport in archaea: Sec and twin arginine translocation pathways. Curr Opin Microbiol. 2005;8(6):713–19. doi:10.1016/j.mib.2005.10.006.
4. von Heijne G. The signal peptide. J Membr Biology. 1990;115:195–201. doi:10.1007/BF01868635.
5. Kapp K, Schrempf S, Lemberg MK, Dobberstein B. Post-Targeting Functions of Signal Peptides [Internet]. Landes Bioscience; 2009. Available from: https://www.ncbi.nlm.nih.gov/books/NBK6322/
6. Owji H, Nezafat N, Negahdaripour M, Hajiebrahimi A, Ghasemi Y. A comprehensive review of signal peptides: structure, roles, and applications. Eur J Cell Biol. 2018;97(6):422–41. doi:10.1016/j.ejcb.2018.06.003.
7. Geukens N, Frederix F, Reekmans G, Lammertyn E, Mellaert LV, Dehaen W, Maes G, Anné J. Analysis of type I signal peptidase affinity and specificity for preprotein substrates. Biochem Bioph Res Co. 2004;314(2):459–67. doi:10.1016/j.bbrc.2003.12.122.
8. Duan G, Ding L, Wei D, Zhou H, Chu J, Zhang S, Qian J. Screening endogenous signal peptides and protein folding factors to promote the secretory expression of heterologous proteins in Pichia pastoris. J Biotechnol. 2019;306:193–202. doi:10.1016/j.jbiotec.2019.06.297.
9. Kang X-M, Cai X, Huang Z-H, Liu Z-Q, Zheng Y-G. Construction of a highly active secretory expression system in Bacillus subtilis of a recombinant amidase by promoter and signal peptide engineering. Int J Biol Macromol. 2020;143:833–41. doi:10.1016/j.ijbiomac.2019.09.144.
10. Liu Y, Shi C, Li D, Chen X, Li J, Zhang Y, Yuan H, Li Y, Lu F. Engineering a highly efficient expression system to produce BcaPRO protease in Bacillus subtilis by an optimized promoter and signal peptide. Int J Biol Macromol. 2019;138:903–11. doi:10.1016/j.ijbiomac.2019.07.175.
11. Román R, Miret J, Scalia F, Casablancas A, Lecina M, Cairó JJ. Enhancing heterologous protein expression and secretion in HEK293 cells by means of combination of CMV promoter and IFNα2 signal peptide. J Biotechnol. 2016;239:57–60. doi:10.1016/j.jbiotec.2016.10.005.
12. Zhou C, Xue Y, Ma Y. Characterization and high-efficiency secreted expression in Bacillus subtilis of a thermo-alkaline β-mannanase from an alkaliphilic Bacillus clausii strain S10. Microb Cell Fact. 2018;17(1):124. doi:10.1186/s12934-018-0973-0.
13. Ng DTW, Sarkar CA. Engineering Signal Peptides for Enhanced Protein Secretion from Lactococcus lactis. Appl Environ Microb. 2013;79(1):347–56. doi:10.1128/AEM.02667-12.
14. Shi L, Liu H, Gao S, Weng Y, Zhu L. Enhanced Extracellular Production of Is PETase in Escherichia coli via Engineering of the pelB Signal Peptide. J Agr Food Chem. 2021;69(7):2245–52. doi:10.1021/acs.jafc.0c07469.
15. Zhou Y, Liu P, Gan Y, Sandoval W, Katakam AK, Reichelt M, Rangell L, Reilly D. Enhancing full-length antibody production by signal peptide engineering. Microb Cell Fact. 2016;15(1):47. doi:10.1186/s12934-016-0445-3.
16. Jonet MA, Mahadi NM, Murad AMA, Rabu A, Bakar FDA, Rahim RA, Low KO, Illias RM. Optimization of a Heterologous Signal Peptide by Site-Directed Mutagenesis for Improved Secretion of Recombinant Proteins in Escherichia coli. J Mol Microb Biotech. 2012;22(1):48–58. doi:10.1159/000336524.
17. Gibson SJ, Bond NJ, Milne S, Lewis A, Sheriff A, Pettman G, Pradhan R, Higazi DR, Hatton D. N terminal or signal peptide sequence engineering prevents truncation of human monoclonal antibody light chains. Biotechnol Bioeng. 2017;114(9):1970–77. doi:10.1002/bit.26301.
18. Savojardo C, Martelli PL, Fariselli P, Casadio R. DeepSig: deep learning improves signal peptide detection in proteins. Bioinformatics. 2017;34(10):1690–96. doi:10.1093/bioinformatics/btx818.
19. Wu J-M, Liu Y-C, Chang DT-H. SigUNet: signal peptide recognition based on semantic segmentation. BMC Bioinform. 2019;20(S24):677. doi:10.1186/s12859-019-3245-z.
20. Zheng Z, Chen Y, Chen L, Guo G, Fan Y, Kong X. Signal-BNF: a Bayesian Network Fusing Approach to Predict Signal Peptides. J Biomed Biotechnol. 2012;2012:492174. doi:10.1155/2012/492174.
21. Reynolds SM, Käll L, Riffle ME, Bilmes JA, Noble WS. Transmembrane Topology and Signal Peptide Prediction Using Dynamic Bayesian Networks. Plos Comput Biol. 2008;4(11):e1000213. doi:10.1371/journal.pcbi.1000213.
22. Zhang W-X, Pan X, Shen H-B. Signal-3L 3.0: Improving Signal Peptide Prediction through Combining Attention Deep Learning with Window-Based Scoring. J Chem Inf Model. 2020;60(7):3679–86. doi:10.1021/acs.jcim.0c00401.
23. Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, von Heijne G, Nielsen H. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol. 2019;37(4):420–23. doi:10.1038/s41587-019-0036-z.
24. Garcion C, Béven L, Foissac X. Comparison of Current Methods for Signal Peptide Prediction in Phytoplasmas. Front Microbiol. 2021;12:661524. doi:10.3389/fmicb.2021.661524.
25. Jeiranikhameneh M, Moshiri F, Falasafi SK, Zomorodipour A. Designing Signal Peptides for Efficient Periplasmic Expression of Human Growth Hormone in Escherichia coli. J Microbiol Biotechn. 2017;27(11):1999–2009. doi:10.4014/jmb.1703.03080.
26. Tehrani SS, Goodarzi G, Naghizadeh M, Khatami SH, Movahedpour A, Abbasi A, Shabaninejad Z, Khalaf N, Taheri-Anganeh M, Savardashtaki A. Suitable Signal Peptides for Secretory Production of Recombinant Granulocyte Colony Stimulating Factor in Escherichia coli. Recent Patents Biotechnology. 2020;14(4):269–82. doi:10.2174/1872208314999200730115018.
27. Mori A, Hara S, Sugahara T, Kojima T, Iwasaki Y, Kawarasaki Y, Sahara T, Ohgiya S, Nakano H. Signal peptide optimization tool for the secretion of recombinant protein from Saccharomyces cerevisiae. J Biosci Bioeng. 2015;120(5):518–25. doi:10.1016/j.jbiosc.2015.03.003.
28. Klatt S, Konthur Z. Secretory signal peptide modification for optimized antibody-fragment expression-secretion in Leishmania tarentolae. Microb Cell Fact. 2012;11:1. doi:10.1186/1475-2859-11-1.
29. Shanthalingam S, Srikumaran S. Intact signal peptide of CD18, the beta-subunit of beta2-integrins, renders ruminants susceptible to Mannheimia haemolytica leukotoxin. PNAS. 2009;106(36):15448–53. doi:10.1073/pnas.0906775106.
30. von Heijne G. Patterns of Amino Acids near Signal-Sequence Cleavage Sites. Eur J Biochem. 1983;133:17–21. doi:10.1111/j.1432-1033.1983.tb07424.x.
31. von Heijne G. Signal sequences: The limits of variation. J Mol Biol. 1985;184(1):99–105. doi:10.1016/0022-2836(85)90046-4.
32. von Heijne G. How signal sequences maintain cleavage specificity. J Mol Biol. 1984;173(2):243–51. doi:10.1016/0022-2836(84)90192-X.
33. SignalP 5.0 Web Server [Internet]. Available from: http://www.cbs.dtu.dk/services/SignalP/
34. Hagemann U, Griep R, Reiersen H, Kiprijanov S. Anti CCR4 antibodies and uses thereof. World Intellectual Property Organization. (WO 2012/076883 A1). 2012.
35. Silence K, Ulrichts P, de Haard JJW, Dreier T, Saunders MJS, Wajant H, Gabriels SME, Moshir M. Antibodies to CD70. World Intellectual Property Organization. 2011. p. WO2012123586A1.
36. Adams C. DR5 antibodies and uses thereof (US20070031414A1). United States Patent and Trademark Office. 2005.
37. Kwong P, Kwong YD, Georgiev I, Ofek G, Zhang B, Mckee K, Macsola J, Chuang G, O’Dell S, Gindin T, et al. Neutralizing antibodies to HIV-1 gp41 and their use (WO2017079479A1). World Intellectual Property Organization. 2015.
38. Hsieh C-M, Ivanov A, Waegell W. Immunobinders directed against sclerostin (WO2013/063095). World Intellectual Property Organization. 2011.
39. PDB Protein Data Bank [Internet]. Available from: https://www.rcsb.org/
40. Signal Peptide Database [Internet]. Available from: http://signalpeptide.de/?m=searchspdb
41. Human V-gene germline signal peptide database [Internet]. Available from: https://www2.mrc-lmb.cam.ac.uk/vbase/alignments2.php
42. NCBI [Internet]. Available from: https://www.ncbi.nlm.nih.gov/
43. Scott DW. Multivariate Density Estimation: Theory, Practice, and Visualization. Hoboken (NJ): John Wiley & Sons, Inc.; 1992.
