1,651
Views
1
CrossRef citations to date
0
Altmetric
Research Articles

Using structural analysis to explore the role of hepatitis B virus mutations in immune escape from liver cancer in Chinese, European and American populations

, , , , , , , & show all
Pages 1586-1596 | Received 07 Apr 2020, Accepted 25 Sep 2020, Published online: 08 Oct 2020

Abstract

Hepatitis B virus (HBV) infection is an important problem threatening human health. After HBV virus invades human body, it may assemble a complete virus particle in the cytoplasm to trigger the immune reaction, especially the interaction between the HBV virus and the host that mediated by CD8+ T cell. We collected the sequences of HBV from the HBVdb database, then screened candidate mutation sites in Chinese, European and American populations based on conservation and physicochemical properties. After that we constructed the three-dimensional structure of Major histocompatibility complex class I (MHC I) -peptide complexes, performed molecular docking, run molecular dynamics to compare the binding free energy, stability, and affinity of MHC I-peptide complexes with the aim to estimate the effect of peptide mutation. The specific HBV virus subtypes of the Chinese, European and American population were studied and the candidate mutation sites were used to predict the mutant peptide antigen. Finally, based on physical and chemical properties and peptide antigen prediction scores, 21 HBV mutation sites were selected. Then combined with specific Human lymphocyte antigen (HLA) subtypes, 11 mutations were found to have a significant negative impact on affinity, stability and binding free energy. Overall, our work found important potential mutations, which provide an evaluation of HBV mutations and a clue of it in immunotherapy.

Communicated by Ramaswamy H. Sarma

Introduction

HBV infection is seriously threatening human health. According to the latest data released by the World Health Organization 2017, there are about 350 million people with chronic HBV infection in the world. There are about 1 million people die directly from HBV infection-related diseases every year. China is a high prevalence area of HBV with 93 million infected people, including 25-30 million chronic HBV patients(Francisco et al., Citation2017; Fu-Sheng et al., Citation2014). HBV virus infection and disease are mainly caused by the interaction between virus and specific HLA in different populations, which mediated by CD8+T cell (Levrero & Zucman-Rossi, Citation2016). Previous studies have shown that the genetic background of the host plays a crucial role in HBV infection and its prognosis (S. Li et al., Citation2012; Pan et al., Citation2013; Tan et al., Citation2008), different ethnic groups may be susceptible to different genotypes of HBV virus strains, and many tools used to study the interaction between HBV and HLA molecules in host immune system have been designed for the European and American populations currently (Rivino et al., Citation2013). However, the mechanisms of the interaction between susceptible HBV strains and specific MHC allotype in Chinese, European and American populations have not been systematically studied.

HBV is a DNA virus, whose genome is extremely small, so its protein is encoded by multiple overlapping reading frames, of which four open reading frames are S region, C region, P region, and X region (Cossart & Field, Citation1970). The S region encodes a surface antigen, an envelope protein of hepatitis B virus, which mediate the recognition between HBV virus and hepatocyte receptors (Gripon et al., Citation2005). The C region encodes the core antigen (HBcAg) and the hepatitis B e antigen (HBeAg) that buried inside HBcAg. HBcAg constitutes an icosahedral nucleocapsid, which encapsulates its mRNA to form core particles containing antigenic determinants, which is the key target for the immune response (Al-Qahtani et al., 2018; Tong & Revill, Citation2016; Wang et al., Citation1991).

After HBV infects into the host, the HBV virus genome has long been under the choice of immune pressure, and may generate new mutant strains through constant mutation to carry out immune escape, especially triggering the immune response associated with CD8+ T cells. HBV mutations are mainly found by sequencing, and those with a mutation frequency greater than 10% are defined as hotspot. In the current study, the correlation between hotspot and liver cancer is mainly assessed by meta-analysis and statistical methods. (Wei et al., Citation2017). The antigen produced by the intracellular HBV virus is processed into peptide antigen, which is then expressed on the cell surface by MHC I (Levrero & Zucman-Rossi, Citation2016; Webster et al., Citation2004), and is recognized by the T cell receptor (TCR) to form a pMHC-TCR ternary body molecule.

However, little is known about the mechanisms of immune-stress-related HBV mutations in T cell epitopes and their association with disease progression (Schuch et al., Citation2014). Previous studies have shown that many mutations reduce the antigenicity of the peptides they produce and affect the recognition of HBV virions by the immune system, this phenomenon may affect the formation of subsequent pMHC-TCR complex by reducing the binding of mutant antigenic peptides to MHC class I molecules, and induce the occurrence of immune escape (Bertoletti & Ferrari, Citation2016). In immunotherapy, one of the common methods is to promote the existing HBV-specific T cells, enhance the recognition of HBV-infected targets to achieve the goal of clearing the virus (Bertoletti & Le Bert, Citation2018). Antigen molecules from their initial state to the antigenic peptides that presented on the surface of MHCI molecules need to be degraded by proteasome, then further modified by heat shock proteins HSP70 and HSP90, and then selectively transported to the endoplasmic reticulum by the relevant transporter TAP. Then, the processed antigenic peptides will be submitted by gp69 to the newly synthesized class I molecules, and assembled under the action of chaperones such as tapasin, calanexin, calreticulin, etc. The formed complex is transported to the cell surface via the golgi apparatus for recognition by CD8 + T cells and stimulates immune escape (Riedl et al., Citation2006). Our focus is on the last but crucial final step-antigen presentation, to explore whether the mechanism of the binding between mutant antigenic peptides and MHC molecules can give a little inspiration to immunotherapy for liver cancer.

Through the investigation of the published literature, it is found that the existing HBV database is mostly functional, and the data are basically from Genbank (Hayer et al., Citation2013; Li et al., Citation2013 ). We choose HBVdb database, which source as European ENA (Nucleotide Archive), shares data with Genbank, DDBJ (DNA Data Bank of Japan), and keep updating to get data. We performed alignment and conservation analysis on 11,384 HBV virus sequences based on different HBV subtypes and coding regions, obtained 172 mutations that fell in the conserved regions. Finally, 21 key mutations were finally obtained through the physicochemical property analysis and antigenic peptides screening. Combined with the specific HLA subtypes of Chinese, European and American populations, the predicted antigenic peptides corresponding to the mutation was docked with HLA to construct a peptide-MHC complex and calculate the affinity and stability of the complex. Then, the molecular dynamics simulation was performed to calculate the binding free energy (MM/GBSA) and to predict the effect of the mutation on the complex.

Materials and methods

Data collection, conservation and physicochemical properties analysis

The amino acid sequences of the HBV S (SHBs) and C (PreC) regions of A, B and C subtypes in this study were downloaded from the HBVdb database (Hayer et al., Citation2013). There were 3349 sequences in the S region of subtype A, 1523 sequences in C region, 5594 sequences in S region of B subtype, 2862 sequences in C region, 9468 sequences in S region of C subtype, and 2801 sequences in C region. The highest frequency HLA-A subtype in Chinese and European or American populations comes from Han-MHC (Fusheng, Citation2016). The three-dimensional structure of HLA-A*1101 (PDB ID: 2HN7, 5GRG, 5GSD, 5GRD, 4UQ2) and HLA-A*0201(PDB ID: 1I1F, 5YXU, 4UQ3, 2HN7, 2V2W) were collected. Aligned amino acid sequences can be found in the DATASET in the HBVdb database. And PSIPRED 4.0 and TMHMM were used to predict the physicochemical properties of mutant amino acids, including helix and topology of transmembrane (Buchan & Jones, Citation2019; Krogh et al., Citation2001).

Prediction of antigenic peptide

Here, we utilized a new framework, NetMHCPan4.0, to predict antigenic peptides according to the specific HLA subtypes of Chinese and European or American populations. The length of predicted peptide antigen was across 8-14 amino acids and the optimal candidate was selected according to the score and %Rank provided by predict results. (Jurtz et al., Citation2017).

Protein preparation

The peptides in peptide-MHC complex were derived from the NetMHCPan4.0 predicted results, and the MHC molecules were obtained from the PDB. As for MHC protein, we used structures of HLA-A*11:01(PDB ID: 2HN7, 5GRG, 5GSD, 5GRD, 4UQ2) and HLA-A*02:01(PDB ID: 1I1F, 5YXU, 4UQ3, 2HN7, 2V2W).

The primary structures were performed with the following structural preparations. Firstly, the original structure was optimized using a hydrogen bond network. Secondly, the water molecules were manipulated by optimizing hydrogen bonding. Thirdly, exist strain was mitigated and the position of each set was fine tuned. Missing sides and loop were also filled in by Prime (Sastry et al., Citation2013). Finally, we obtained prepared proteins.

Protein interaction analysis and peptide docking

The interaction between original HLA and peptide was performed by the “Protein Interaction Analysis” workflow in Bioluminate. The docking was accomplished by Glide 5.7 program in Schrodinger Suite. Notably, the docking grid was generated using “Centroid of selected residues” with selected residue that is based on the results of Protein Interaction Analysis. In addition, GlideScore value was applied in the docking protocol, and the models of top scores were selected as the final docking poses. (Beard et al., Citation2013).

Molecular dynamics simulation, RMSD calculation and structural analysis

The docking poses were used as the initial structure in MD simulation, all MD simulations were carried out using Desmond module in Schrodinger Suite with OPLS3e force field. The simulation system was put in an orthorhombic periodic box of SPC water molecules, the distance between the complex and the box boundary was set to 10.0 Å along each dimension. Relaxing model system was carried out before simulations to optimize high energy, then the MD simulations were conducted at 1 atm and 300 K under the NPT (constant particles N, temperature T, and pressure P) ensemble for 100 ns (Piao et al., Citation2019). The trajectory snapshots were saved every 40 ps, and 1,250 frames were collected for further analyses (Ylilauri & Pentikainen, Citation2013).

RMSD was calculated by Desmond module, indicating the degree of molecular structure change. Residue Scanning was calculated by Bioluminate with default parameter to specify the mutations of interest.

MM/GBSA binding energy calculation

Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) has been widely used to calculate binding free energy for protein-ligand systems. The calculations are done from the MD trajectories with Prime MM-GBSA, to predict the free energy of binding (ΔGbind) between protein-peptide or protein-protein complexes based on the molecular dynamics (Ylilauri & Pentikainen, Citation2013). The binding free energy of MM/GBSA was calculated as the following equation:

ΔGbind = Gcomplex – Gprotein– Gligand, where ΔGbind is the binding free energy, and Gcomplex, Gprotein, and Gligand are the free energies of complex, protein, and ligand. The equation ΔEMM + ΔGGB + ΔGnonpolar – TΔS is used to estimate the energy, ΔEMM is the gas-phase interaction energy between protein and the binding peptide, the polar and nonpolar components of the desolvation free energy represented by ΔGGB and ΔGnonpolar, and TΔS is the change of conformational entropy on ligand binding.

Results

Collection of HBV mutation sites in Chinese, European and American populations

Due to the significant regional differences in HBV genotypes, the HBV genotypes in China are mainly B-type in the south, C-type in the north, and A-type as the main subtype in European and American populations (Sunbul, Citation2014), the amino acid sequences of HBV viruses of subtypes A, B, and C were collected in the HBVdb database. Then we focused on the S and C coding region, which encodes hepatitis B surface antigen and core antigen respectively. The sequences were aligned with the ClusterW module in Mega7.0 (Kumar et al., Citation2016) and the mutation at each locus was counted and displayed in the form of a radar chart (). The mutated amino acids were based on amino acid distribution and physicochemical properties: hydrophobicity (V, I, L, M, W, Y, C), polarity (N, Q, S), positive charge (R, K, H), negative charge (D, E), conformation (G, P), classification in the AT area (Ke, Citation2015; Shen & Vihinen, Citation2004). The statistical results showed that the mutations in the S and C regions of different ethnic groups was different. In the European and American populations, the C coding region of the HBV subtype was more likely to be mutated to Q, while the S coding region tended to be mutated to T. In contrast, the C coding region of the Chinese population susceptible to HBV subtypes was mutated to T, V, I, G, but the S coding region was L, S, R. We found the physicochemical properties indicate that the hydrophobic group has the highest mutation frequency in terms of functionality.

Figure 1. The Radar map of amino acid sequence alignment results of hepatitis B virus subtype A, B, and C.

A. The relative mutability of mutated residues in C and S regions of HBV virus based on 20-letter alphabet amino acids.

B. The relative mutability of mutated residues in the C and S regions of HBV viruses based on the definition of 6 groups, which can divide amino acids into following 6 groups: hydrophobicity (V, I, L, M, W, Y, C), polarity (N, Q, S), positive charge (R, K, H), negative charge (D, E), conformation (G, P), and AT area.

Figure 1. The Radar map of amino acid sequence alignment results of hepatitis B virus subtype A, B, and C.A. The relative mutability of mutated residues in C and S regions of HBV virus based on 20-letter alphabet amino acids.B. The relative mutability of mutated residues in the C and S regions of HBV viruses based on the definition of 6 groups, which can divide amino acids into following 6 groups: hydrophobicity (V, I, L, M, W, Y, C), polarity (N, Q, S), positive charge (R, K, H), negative charge (D, E), conformation (G, P), and AT area.

Identification of 21 potential candidate mutation sites

To find the mutation sites in conserved regions, the sequences were analyzed by ClusterW1.81 in the HBVdb database (). At the same time, considering the specificity of the viral genome subtype, ethnicity, and distribution area, we selected the most frequently occurring amino acids at each locus as a reference (). The hydrophobic interaction of amino acid residues is the main force for maintaining and stabilizing the conformation of the protein, so we used PSIPRED to screen for hydrophobic amino acid mutation sites, and finally found 37 mutation sites (, ). Peptide antigen prediction was performed on the screened mutation sites using NetMHCPan4.0. Conditions were set for HLA-A*0201 and HLA-A*1101. The peptide length was set to 8-14 amino acids. The threshold of the strong binder was set to 0.5, and the threshold of the weak binder was set to 2. At the same time, we searched for HLA-A*1101 and HLA-A*0201 subtypes in AFND (Allele Frequency Net Database) (González-Galarza et al., 2015). The results also showed that these two subtypes had the highest frequency in European and American populations and Southeast Asian populations.

Table 1. The mutation sites in the conserved region.

Table 2. 37 mutation sites selected based on hydrophobicity.

According to the simulation results of NetMHCPan4.0, HBV mutations with the highest predicted MHC binding affinity were preserved, and we obtained the epitope sequences of 21 mutation sites finally (). It was found that most of the predicted antigenic peptides were 9-10 amino acids in length, which was consistent with the distribution characteristics of T cell epitopes corresponding to specific HLA subtypes (Trolle et al., Citation2016).

Table 3. Mutation sites selected based on antigen peptide prediction score (Cutoff is rank%<2) The one that labeled red is an antigenic peptide that has appeared in the hepitopes database, the rank% threshold for strong binding peptides is 0.500, and the weak one is 2.000.

However, it has been reported that the PreS1 domain, buried in viral membrane, directly interacts with the sodium-taurocholate cotransporting polypeptide (NTCP). Whereas the α-determinant, which locates on the small S domain outside the membrane, interacts with heparin sulfate proteoglycan on hepatocytes, increasing the concentration of HBV virus particles on the surface of liver cells (Bertoletti & Ferrari, Citation2016). So, we want to explore whether the effects of mutations inside and outside the membrane are different, which play a key role in virus infection. Combined with the predicted result of transmembrane region using TMHMM server v.2.0 (), six conservative candidate mutation sites of transmembrane helical regions of subtype A, I199V, I202V/M, I256M, V351A, V354A and L389V, were found in the S region.

Figure 2. Prediction of transmembrane regions of S region.

This picture shows the predicted map of transmembrane region. The abscissa is the position of the amino acid residue, and the ordinate is the prediction possibility. The black line indicates that the position may be in the intracellular segment; the red line denotes the outside region; and the vertical line on the X axis represents the prediction of transmembrane region.

Figure 2. Prediction of transmembrane regions of S region.This picture shows the predicted map of transmembrane region. The abscissa is the position of the amino acid residue, and the ordinate is the prediction possibility. The black line indicates that the position may be in the intracellular segment; the red line denotes the outside region; and the vertical line on the X axis represents the prediction of transmembrane region.

Moreover, 13 of the 21 antigenic peptides predicted from the mutation sites were included in the hepitopes database(Lumley et al., Citation2016), which collected all HBV epitopes corresponding to HLA class I genes based on published literature (). This indicates that our predictions have certain credibility. Seven antigenic peptides were not reported, and the corresponding mutation sites were I32F/V, L44F, L98F, L60I, V149A, I155V/M, L213F, and I256M. The Signal P server result reveals that (Almagro Armenteros et al., Citation2019) that our collected sites are not in the signal peptide region ().

Docking of the HLA and peptide

Structures used for docking, including HLA-A*0201 (PDB ID: 1I1F, 5YXU, 4UQ3, 2HN7, 2V2W) and HLA-A*1101 (PDB ID: 2HN7, 5GRG, 5GSD, 5GRD, 4UQ2), were obtained from PDB. These structures were conducted by protein interaction analysis, which analyzed the positions of the amino acid residues of MHC molecules in contact with peptides. We selected the repetitive sites as the sites in the binding pocket of MHC and incorporated these sites as constraints in the docking. In addition, we docked the peptides with each of the selected PDB structures. and selected the optimal structure according to the docking scores. then performed molecular dynamics simulations on the corresponding complexes, and finally selected the best receptor of each peptide based on the RMSD results ().

Figure 3. Binding models of peptide FLLTRILTI and MHC I molecular proteins.

MHC and peptide are separately colored in gray and magenta, key residues are shown in stick model. Hydrogen bonds are shown in yellow dash lines.

Figure 3. Binding models of peptide FLLTRILTI and MHC I molecular proteins.MHC and peptide are separately colored in gray and magenta, key residues are shown in stick model. Hydrogen bonds are shown in yellow dash lines.

The scoring system of the peptide docking module is mainly evaluated by calculating the binding free energy in the implicit solvent. For example, the binding mode of the peptide antigen FLLTRILTI produced by I199 to HLA is shown in figure (). It can be seen that the peptide binds to the groove of the HLA. L3, I6, and L7 residues pointing to the outer space and do not contact HLA protein. F1, L2, and I9 which are inside the groove. Moreover, F1 contacts TYR171 and TYR159 on the groove to form hydrogen bonds, and L2 and I9 also form hydrogen bonds with GLU63 and TYR84, respectively. Clearly, HBV virus mutations mentioned above in these residues may affect the binding between peptide and HLA.

The impact of the 21 mutations on complex formation

Based on the docked MHC-peptide complex, we performed a residue scanning calculation on these 21 candidate mutations. The results showed that the eight mutations: L149V, I156L, L213F, I256M, V351A, V354A, M372V/I and L389V in subtype A and V149A, I155V/M in subtype B resulted in a change in affinity greater than +3 kcal/mol. It can be speculated that the mutations have a significant effect on the affinity between the peptide and MHC, which may result in weakening binding of the antigenic peptide to MHC and immunological escape. In addition, mutations of L213F, I256M, M372V/I and I155M also resulted in a decrease in the overall stability of the complex (). Among them, the mutation of M372V caused the most negative effects, and the changes of affinity and stability were as high as +10.96 kcal/mol and +8.22 kcal/mol, respectively.

Figure 4. Residue Scanning Computation of 21 mutation sites.

A. Results of residue scanning are shown on the genome. The inner circle displays the HBV viral genome structure and its encoded protein. The first outer ring (green) indicates the change in affinity produced by the corresponding site mutation, and the outermost layer (blue) indicates the resulting change in stability. The degree of change is expressed in terms of the height of the column, where the negative result is marked in red.

B. Comparison of results. The left and right images represent mutations that occur in subtype A and B/C, respectively. The ordinate indicates the mutation site. The horizontal axis represents ΔΔGstability (kcal/mol) and ΔΔGaffinity (kcal/mol), marked with blue and red. The threshold is set at +3kcal/mol (yellow line). Positive values indicate that the mutation has a negative impact on affinity and stability.

Figure 4. Residue Scanning Computation of 21 mutation sites.A. Results of residue scanning are shown on the genome. The inner circle displays the HBV viral genome structure and its encoded protein. The first outer ring (green) indicates the change in affinity produced by the corresponding site mutation, and the outermost layer (blue) indicates the resulting change in stability. The degree of change is expressed in terms of the height of the column, where the negative result is marked in red.B. Comparison of results. The left and right images represent mutations that occur in subtype A and B/C, respectively. The ordinate indicates the mutation site. The horizontal axis represents ΔΔGstability (kcal/mol) and ΔΔGaffinity (kcal/mol), marked with blue and red. The threshold is set at +3kcal/mol (yellow line). Positive values indicate that the mutation has a negative impact on affinity and stability.

In order to consider the relax process of the mutant protein and perform global analysis including interactions between residues or domains, molecular dynamics simulation was used for subsequent analysis. Mutation with affinity change greater than +3kcal/mol was selected, and the predicted peptide-MHC complex was simulated with 100 ns molecular dynamics. The RMSD of backbone evolution with time was calculated by using the initial structure as a reference structure, and the binding free energy of the period was calculated based on the trajectories with stable RMSD values. The effects of the mutation were explained by comparing the changes of binding free energy between wild type and mutant type. For example, for the wild type and mutation of L389V of HBV peptide, we conducted 100 ns MD simulations. Based on the RMSD analysis, we found that the system of wild type and L389V went to equilibrium after 20 and 30 ns, respectively (). The MM/GBSA binding free energy was calculated by using the snapshots extracted from the trajectory of 20-100 ns in the wild type system and 30-60 ns in the L389V system. It was found that the Leu mutated into Val cause the absolute values of the binding free energy to decrease from 69.48 kcal/mol to 61.73 kcal/mol. Consistent with the result of the residue scanning, the binding free energy analysis results confirmed that the mutation would weaken the affinity between the HLA molecule and the peptide antigen (, S3, and Table S4). We have also selected peptide antigen FLPSDFFPSV (hepatitis B core 18-27), which have confirmed the mutations at position P10 would reduce the binding of peptide to MHC I molecules by classical laboratory methods (Bertoletti et al., Citation1994). The mutation from Val to Ala resulted in the decrease of binding free energy from 81.99 kcal/mol to 72.79 kcal/mol, which was consistent with the experimental results, indicating that the bioinformatics analysis method we adopted had certain reliability.

Figure 5. RMSD of docking structure produced by mutant L389V.

A stable period (20ns∼100ns, 30 ∼ 60ns) demonstrates a stable structure. The arrow points the structure that we selected to run MM/GBSA calculation. The MM/GBSA values of the wild type (L389) and mutant type (L389V) were -69.48kcal/mol and -61.73kcal/mol, respectively.

Figure 5. RMSD of docking structure produced by mutant L389V.A stable period (20ns∼100ns, 30 ∼ 60ns) demonstrates a stable structure. The arrow points the structure that we selected to run MM/GBSA calculation. The MM/GBSA values of the wild type (L389) and mutant type (L389V) were -69.48kcal/mol and -61.73kcal/mol, respectively.

Figure 6. MM/GBSA calculation results of mutation sites.

The x-axis represents the mutation sites and the ordinate is the values (kcal/mol) of MM/GBSA. The red and blue columns represent the MM/GBSA values of the wild-type and the mutant-type, respectively.

Figure 6. MM/GBSA calculation results of mutation sites.The x-axis represents the mutation sites and the ordinate is the values (kcal/mol) of MM/GBSA. The red and blue columns represent the MM/GBSA values of the wild-type and the mutant-type, respectively.

In general, the second (P2) and ninth (P9) residues of antigenic peptides bound to HLA-A *02:01 are usually hydrophobic residues, such as Leu, Val, or Ile, and their interactions are usually mediated by hydrogen bonds. Based on this, we chose mutations that occurred at positions 2 and 9 that were originally hydrophobic amino acids. Finally, four mutation sites L149, I156, L170 and V354 were selected. We analyzed the effect of anchor residues mutation on non-covalent bonds between peptides and MHC. It was found that the mutation at P2 of the mutant peptide antigen YL/VSFGVWI produced by L149V and the mutation at P9 of the mutant peptide YLVSFGVWI/L produced by I156L resulted in a reduction in the total number of hydrogen bonds generated. The L149V mutation caused the original hydrogen bond formed by P2 and TRP167 on HLA to disappear. Similarly, the I156L mutation at P9 of peptide led to the disappearance of the hydrogen bonds originally formed by Ile and ARG97 (). In the peptide antigen formed by L170V (IL/VSTLPETTV) mutation, Leu initially bound to ASP78 and ARG98 to form a hydrogen bond, which disappeared after mutation (). The same is true for the mutant peptide antigen WLSLLVPFV/A produced by V354A, the mutation caused the hydrogen bond generated by the original P9 and GLN155 to disappear, the non-covalent binding force between the anchor residue and HLA was reduced ().

Figure 7. Non-covalent pattern between peptide and HLA.

The picture shows the covalent bond between the antigenic peptide and HLA. MHC and peptide are separately colored in yellow and gray. The mutant amino acids are in red, the yellow dotted line is the hydrogen bond.

A-D. Non-covalent bond pattern between HLA and residue of P2or P9position of peptide antigen that corresponding to L149V, I156L, L170V, and V354A before and after mutation.

Figure 7. Non-covalent pattern between peptide and HLA.The picture shows the covalent bond between the antigenic peptide and HLA. MHC and peptide are separately colored in yellow and gray. The mutant amino acids are in red, the yellow dotted line is the hydrogen bond.A-D. Non-covalent bond pattern between HLA and residue of P2or P9position of peptide antigen that corresponding to L149V, I156L, L170V, and V354A before and after mutation.

We also performed non-covalent bond analysis for other mutated sites, but their effect on the non-covalent binding force between peptide and HLA was not very significant. This may be due to the amino acid properties that did not change much before and after the mutation, and it is possibly because of the predicted antigenic peptides are not all standard nonapeptides. Previous studies have shown that when the length of the antigenic peptides is more than 9 amino acids, the anchor residue originally thought will migrate (Phillip, Citation2017).

Discussion

Previous work showed increased expression of MHC class I molecules in HBV-infected liver cancer cell lines and liver cancer tissues, but it did not promote CD8+ T cell-mediated immune clearance, which may be related to the immune escape of HBV virus (Lianhong, Citation2010; Shen et al., Citation2009). Moreover, different ethnic groups may be susceptible to different genotypes of HBV strains, so we still need to consider the different genotypes of HBV and the genetic background of the host.

To clarify the mechanism of immune escaping caused by HBV mutations in Chinese, European and American populations, the amino acid sequences of various regions of different HBV subtypes were collected from the HBVdb database. The B, C subtypes are dominant in Chinese and the A subtype is dominant in European and American population. After alignment, the mutations in the conserved regions of the HBV viral genome were identified. At the same time, we calculated the mutations of each subtype S region and C region as a whole, and found that the hydrophobic amino acids had the highest mutation rate. Furthermore, the hydrophobic amino acids in the conserved region were selected to predict antigenic peptides, and finally obtain 21 antigenic peptides. As far as the MHC structures are concerned, we used HLA-A*0201 and HLA-A*1101 subtypes which are the highest frequency HLA subtypes among Chinese and European and American populations (Gragert et al., Citation2013). Investigation results of these two subtypes in AFND (Allele Frequency Net Database) also confirmed this. Then the antigenic peptides were docked with the corresponding HLA structures, and the compounds were subjected to residue scanning to study the effect of the mutation on the affinity and stability of the MHC-peptide complex. The results showed that, for example, mutations of L213F and I256M both had significant effects on the affinity of the complex, with affinity changes as high as +8.06 kcal/mol and +7.23 kcal/mol respectively. Combined with transmembrane region analysis, we noticed that except for I256M, mutations at the other five sites had a negative effect on the binding of peptide antigen to MHC molecule, and the I256M, V351A, V354A and L389V mutations had a more significant effect on the decrease in affinity (). Since our focus is on the binding of antigenic peptides to MHC molecules, we chose mutations with affinity changes greater than +3kcal/mol, and finally collected 11 mutation sites, L149V, I156L, L213F, I256M, V351A, V354A, M372V/I, L389V, I155V and V149V. Among these mutation sites, antigenic peptides predicted by L213F, I256M, V149A and I155V were not present in the hepitopes database (Lumley et al., Citation2016). After that, molecular dynamics simulation was carried out on the corresponding complex, and the equilibrium trajectories were selected according to the RMSD analysis and the MM/GBSA calculation was performed to study the influence of the mutation.

The results showed that these mutations led to a decrease in binding free energy to varying degrees. For example, the binding free energy of I155V was reduced from 129.12 kcal/mol to 89.35 kcal/mol, which affected the binding of peptides to MHC molecules. At the same time, we combined the calculated results with transmembrane analysis and found that four of six mutations located in the transmembrane region were found to cause a decrease in the efficiency of presentation of antigenic peptides by MHC. Therefore, we supposed that mutations in the transmembrane region may play a more important and should be taken into account in the study of the immune escape mechanism of hepatitis B virus. Actually, it has been reported in the literature that by testing the mutations of the three variant antigenic peptides sequences, it is found that the natural single residue exchange of the HBsAg peptide antigen sequence IVSPFIPLL (L389V) bound to mouse or human HLA-A*0201 class molecules can change the immunogenicity of CD8+T cells (Riedl et al., Citation2006). The experimental results found that the mutation induced the efficient initiation of different but cross-reactive CD8+ T cell populations, and different T cell populations were specifically activated. Therefore, it is speculated that mutations may participate in cross-reaction in addition to cause instability of the peptide-MHC complex. It has also discovered that immune escape mutation may promote the spread of HBV and increase the adaptability of drug-resistant strains in vaccinated individuals (Luna, Citation2018).

The predicted scores of some peptides may differ from the experimental data, which may be due to the software is only based on affinity and mass spectral peptide data. While in the actual biological process, the stability of MHC complex, TAP transport, shear and other related processes may be affected by many factors. Although, the predicted affinity does not represent the actual situation of intermolecular interactions, it can still provide clues in understand the laws of immunology. As well as, the use of bioinformatic methods to study the affinity of peptides and HLA molecules also could provide a new insight to investigate the peptide vaccines, improves the efficiency of epitope peptides modification, which can greatly improve research efficiency. In summary, we identified important potential candidate mutations in large and comprehensive data collection that related to immune escape in Chinese, European and American populations through sequence and structural analysis, and mutations reduce the efficiency of MHC molecules to present antigenic peptides by altering the antigenic peptides produced by HBV viruses, which explains the mechanism that might leads to immune escape and provide a new insight for immunotherapy from the structural aspect. On the basis of the above work, we will continue to explore the binding mechanism between other MHC subtypes and different HBV genotypes in the future.

Authors’ contributions

JL and XL conceived this work, JL and WX supervised this work. SLG conducted data collection and data analysis. LL, XYL and XL assisted the data collection. SLG drafted the manuscript. JL, RK, JQZ, JCD and WX modified the manuscript. All authors read and approved the manuscript.

Abbreviations
HBV=

Hepatitis B virus

HLA=

Human lymphocyte antigen

MHC=

Major histocompatibility complex

MM/GBSA=

Generalized Bonn surface area based on molecular mechanics

NTCP=

sodium-taurocholate cotransporting polypeptide

RMSD=

Root-Mean-Square Deviation

TCR=

T cell receptor

Supplemental material

Availability of data and materials

All sequences involved in this manuscript are available in HBVdb database (https://hbvdb.ibcp.fr/) and AFND (http://www.allelefrequencies.net/). Structure information could acquire from RCSB PDB (https://www.rcsb.org/).

Disclosure statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Funding

This study is mainly supported by National Natural Science Foundation of China (31871322) also from National Natural Science Foundation of China (31900473, 31430035, 31171041), Jiangsu innovative and entrepreneurial talent program (KY216R201805), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (18KJB180015), Nanjing Medical University Science and Technology Development Fund (2016NJMUZD003), and the Fundamental Research Funds for the Central Universities (2242017K3DN23, 2242017K41041). The funding bodies were not involved in the design of the study, collection, analysis or interpretation of data.

References