ISSN: 2379-1764
Research Article - (2017) Volume 5, Issue 1
Sudan Ebola virus (SUDV) is a filovirus (family Filoviridae) with a single-stranded, negative-sense RNA genome that causes hemorrhagic fever. There is no treatment or vaccine for it. The aim of this study is therefore to design a peptide vaccine using immunoinformatics approaches: the glycoprotein of all SUDV strains is analysed to determine the conserved regions, which are then studied to predict all possible epitopes that could be used as a peptide vaccine. A total of 21 Sudan Ebola virus glycoprotein sequences retrieved from the NCBI database were aligned to determine conservancy and to predict epitopes using the IEDB analysis resource. Three epitopes were predicted as peptide vaccine candidates for B cells (PPPPDGVR, ETFLQSPP, LQSPPIRE). For T cells, four epitopes showed high affinity to MHC class I (FLYDRLAST, IIIAIIALL, MHNQNALVC and RTYTILNRK) and high coverage against the Sudan and whole-world populations. For MHC class II, four epitopes interacted with the most frequent MHC class II alleles (FAEGVIAFL, FLRATTELR, FLYDRLAST and FVWVIILFQ), with high coverage against the Sudan and whole-world populations. We recommend in vivo and in vitro studies to prove the effectiveness of these predicted epitopes as a peptide vaccine.
Keywords: Sudan ebola virus (SUDV), Epitope, Peptide vaccine, Immune epitope database (IEDB)
Ebola virus belongs to the family Filoviridae (filoviruses) and is a zoonotic pathogen that causes hemorrhagic fever in both humans and nonhuman primates, with a death rate that has exceeded 80% [1-8]. Ebola virus first appeared in 1976, in Nzara (Sudan) and Yambuku (Democratic Republic of Congo), and then spread into a village near the Ebola River [2,4].
The first outbreak of Ebola virus was in Sudan, specifically in the town of Nzara in southern Sudan. It started in a cotton factory and spread rapidly through person-to-person transmission over 15 generations, leading to 284 infected individuals and 151 deaths. The second outbreak was in Zaire (Democratic Republic of Congo), with a fatality rate of 88%.
Ebola virus has a single-stranded, negative-sense RNA genome encoding a nucleoprotein (NP), viral proteins, a glycoprotein (GP) and the viral RNA-dependent RNA polymerase (L) [6].
The Ebola virus glycoprotein (GP), found on the surface of the virus, is the only viral protein responsible for attachment to host cells and for the immune response, which makes it the main target for vaccine design. GP is processed post-translationally to yield the GP1 and GP2 subunits [9-16].
Many studies show that GP plays an important role in Ebola virus infection by targeting the virus to cells and allowing it to introduce its content into monocytes or macrophages, which may lead to the release of inflammatory cytokines [17]. Ebola virus stimulates the immune system and the inflammatory response at the same time, leading to the release of tumor necrosis factor (TNF) and interferon-γ (IFNγ), which in turn can disrupt some body tissues [18,19].
The first successful vaccine for Ebola virus was developed in guinea pigs using plasmid DNA. GP and sGP enhanced cytotoxic and humoral responses, but this DNA vaccine has been less effective in humans [17].
Our aim is to design a vaccine for Ebola virus using peptides of its glycoprotein as immunogens to stimulate a protective immune response. Survivors show high levels of IgM and IgG responses to the antigen. A Russian investigator developed hyperimmune horse serum that was effective in baboons and guinea pigs but not in cynomolgus monkeys. In addition, horse antibodies are not preferred for humans because some subclasses of horse IgG are immunogenic in humans [18].
Vaccine production that depends on biochemical experiments can be expensive and time consuming, and it does not always work. Moreover, vaccine formulations based on attenuated or inactivated microorganisms contain a few hundred proteins that are unnecessary for the induction of immunity and that may cause allergenic or reactogenic responses [20,21].
Therefore, in silico prediction of epitopes from appropriate protein residues would help in the production of a peptide vaccine with a powerful immunogenic and minimal allergenic effect [22,23]. This is the first study conducted to design a peptide vaccine against Sudan Ebola virus using an immunoinformatics approach.
Protein sequence retrieval
A total of 21 Sudan Ebola virus glycoprotein sequences were retrieved from the NCBI database (http://www.ncbi.nlm.nih.gov/protein/?term=sudan+ebola+virus+glycoprotein) in June 2016. These 21 sequences come from different parts of the world (including 11 collected from Uganda and 4 from Sudan). The retrieved glycoprotein sequences, their accession numbers and area of collection are listed in Table 1.
Phylogenetic and alignment
The retrieved sequences were subjected to phylogenetic and alignment analysis to determine the common ancestor of each strain and the conservancy, using different tools from http://www.phylogeny.fr [24]. The phylogenetic tree and alignment are presented in Figures 1 and 2.
Determination of conserved regions
The retrieved sequences were aligned by multiple sequence alignment (MSA) to obtain the conserved regions. Sequences were aligned with ClustalW as implemented in the BioEdit program, version 7.0.9.0 [25], to find the conserved regions among Ebola spike glycoprotein variants. The candidate epitopes were then analyzed by different prediction tools from the Immune Epitope Database (IEDB) analysis resource (http://www.iedb.org/) [25,26].
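As a minimal sketch of what a conserved region means in practice, assuming the alignment is available as equal-length strings with `-` marking gaps, the following Python snippet reports maximal runs of columns in which every sequence carries the same residue. The function name `conserved_regions`, the `min_len` parameter and the toy sequences are illustrative assumptions and are not part of the BioEdit/ClustalW workflow used in the study.

```python
from typing import List, Tuple


def conserved_regions(aligned: List[str], min_len: int = 1) -> List[Tuple[int, int, str]]:
    """Return (start, end, residues) for runs of identical, gap-free columns.

    Positions are 1-based alignment columns, inclusive.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    regions = []
    run_start = None
    for col in range(length + 1):  # the extra step flushes a run that reaches the end
        conserved = (
            col < length
            and aligned[0][col] != "-"
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                regions.append((run_start + 1, col, aligned[0][run_start:col]))
            run_start = None
    return regions


# Toy alignment (hypothetical sequences, not taken from Table 1).
toy = [
    "MGKLPPPPDGVRAT",
    "MGRLPPPPDGVR-T",
    "MGKLPPPPDGVRST",
]
print(conserved_regions(toy, min_len=3))
```

Raising `min_len` to the epitope length of interest (for example 9 for MHC class I) keeps only conserved stretches long enough to contain a full candidate peptide.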
B-cell epitope prediction
A B-cell epitope is the portion of an immunogen that interacts with B lymphocytes. As a result, the B lymphocyte differentiates into antibody-secreting plasma cells and memory cells. A B-cell epitope is characterized by being accessible and antigenic [27]. Thus, the classical propensity scale methods and hidden Markov model based software from the IEDB analysis resource were used for the following aspects:
Prediction of linear B-cell epitopes: BepiPred from the Immune Epitope Database (http://tools.iedb.org/bcell/) [28] was used to predict linear B-cell epitopes from the conserved region, with a default threshold value of 0.35.
Prediction of surface accessibility: The Emini surface accessibility prediction tool of the Immune Epitope Database (IEDB) (http://tools.immuneepitope.org/tools/bcell/iedb) [29] was used to predict surface-accessible epitopes from the conserved region, with the default threshold value of 1.000.
Prediction of epitope antigenicity sites: The Kolaskar and Tongaonkar antigenicity method (http://tools.immuneepitope.org/bcell/) [30] was used to determine the antigenic sites, with a default threshold value of 1.016.
MHC class I binding predictions
Peptide binding to MHC class I molecules was assessed with the IEDB MHC I prediction tool at http://tools.iedb.org/mhci/. Presentation of the MHC-I peptide complex to T lymphocytes involves several steps; here, the step in which cleaved peptides attach to MHC molecules was predicted. Prediction can be performed with an Artificial Neural Network (ANN), the Stabilized Matrix Method (SMM) or scoring matrices derived from combinatorial peptide libraries; the ANN method was used [31-35]. Prior to prediction, all epitope lengths were set to 9-mers. All conserved epitopes that bound to alleles with a half-maximal inhibitory concentration (IC50) of 100 or less were selected for further analysis [36].
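Setting the epitope length to 9 amounts to sliding a nine-residue window along the protein. The sketch below illustrates that enumeration only; it does not call the IEDB tool. The helper name `sliding_peptides` is hypothetical, and the fragment and its starting position (156) are pieced together from the overlapping peptides and positions reported in Tables 3 and 4.

```python
from typing import Iterator, Tuple


def sliding_peptides(
    sequence: str, length: int = 9, first_position: int = 1
) -> Iterator[Tuple[str, int, int]]:
    """Yield (peptide, start, end) for every window of `length` residues.

    `first_position` is the 1-based protein position of sequence[0].
    """
    for i in range(len(sequence) - length + 1):
        start = first_position + i
        yield sequence[i:i + length], start, start + length - 1


# GP fragment covering positions 156 to 172, as implied by Tables 3 and 4.
fragment = "DGAFFLYDRLASTVIYR"
for peptide, start, end in sliding_peptides(fragment, length=9, first_position=156):
    print(peptide, start, end)
```

Each enumerated 9-mer would then be scored against every allele by the prediction method, and only pairs at or below the IC50 cutoff would be kept.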
MHC class II binding predictions
Peptide binding to MHC class II molecules was assessed with the IEDB MHC II prediction tool at http://tools.immuneepitope.org/mhcii/ [37,38]. For MHC-II binding prediction, the human allele reference set was used. The MHC class II groove can bind peptides of different lengths, and this variability makes prediction more difficult and less accurate [39]. The IEDB MHC II prediction tool offers five prediction methods (SMM-align, NN-align, Combinatorial Libraries, Sturniolo's method and NetMHCIIpan) in addition to the consensus method. SMM-align is a matrix-based method with extensions that incorporate flanking residues outside the binding groove. NN-align uses artificial neural networks that allow simultaneous identification of the MHC class II binding core and the binding affinity. Combinatorial Libraries applies a positional scanning combinatorial library approach, which uses pools of random peptide libraries to systematically measure the contribution to MHC binding of each amino acid at each of the nine positions of the binding peptide. Sturniolo's method and NetMHCIIpan predict peptide binding to HLA-DR molecules only, which makes them less useful. The consensus approach combines the outcomes of three methods (SMM-align, NN-align and Combinatorial Libraries): each method first scores 2,000,000 random peptides from a scan of Swiss-Prot proteins, and these scores then act as a reference for ranking new predictions. The consensus method uses the median rank of the three approaches as the final prediction score [40]. The NN-align method was used to predict MHC class II epitopes. All conserved epitopes that bound to many alleles with a half-maximal inhibitory concentration (IC50) of 1000 or less were selected for further analysis.
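The two numerical rules in this paragraph, the median rank used by the consensus method and the IC50 cutoff applied to the NN-align output, can be sketched in a few lines of Python. The function names and the example ranks are hypothetical and serve only to illustrate the arithmetic; the real tool computes the per-method ranks itself.

```python
from statistics import median


def consensus_rank(smm_align: float, nn_align: float, comblib: float) -> float:
    """Median of the three per-method percentile ranks (lower is better)."""
    return median([smm_align, nn_align, comblib])


def passes_ic50(ic50: float, cutoff: float = 1000.0) -> bool:
    """Keep a peptide-allele pair when the predicted IC50 is at or below the cutoff."""
    return ic50 <= cutoff


# Hypothetical percentile ranks for one peptide-allele pair.
print(consensus_rank(smm_align=4.2, nn_align=1.5, comblib=9.8))
print(passes_ic50(448.4), passes_ic50(1200.0))
```

Taking the median rather than the mean means a single method that ranks a peptide very poorly, or very well, cannot dominate the consensus score.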
Accession Number | Date of collection | Country |
---|---|---|
ACR33190 | 1976 | Sudan |
ABY75325 | 2004 | Sudan |
Q66798 | 1996 | Sudan |
AAB37096 | 1996 | Sudan |
AAC54882 | 1996 | Sudan |
ALT19781 | 2000 | Sudan |
AFP28231 | 2011 | Uganda |
AAR11463 | 2000 | Uganda |
*YP_138523 | 2000 | Uganda |
AAP88031 | 2000 | Uganda |
ALL26375 | 2015 | Canada |
AGB56678 | 1979 | Sudan |
AKB09538 | 2000 | Uganda |
AAU43887 | 2000 | Uganda |
ALH21228 | 1976 | Sudan |
AGL73446 | 2012 | Uganda |
AGL73439 | 2012 | Uganda |
AGL73432 | 2012 | Uganda |
AGL73425 | 2012 | Uganda |
AGL50928 | 2012 | Uganda |
Q7T9D9 | 2012 | Uganda |
*Ref sequence.
Table 1: Virus strains retrieved and their accession numbers and area of collection.
Population coverage calculation
All potential MHC I and MHC II binders of the Sudan Ebola virus glycoprotein were assessed for population coverage against the whole-world population and the Sudan population, using the MHC-I and MHC-II alleles they interacted with, by the IEDB population coverage calculation tool at http://tools.iedb.org/tools/population/iedb_input [41].
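The IEDB tool performs this calculation from curated HLA frequency data. As a rough illustration of the idea only, the sketch below estimates coverage under two simplifying assumptions made for this example and not taken from the study: Hardy-Weinberg proportions within a locus and independence between loci. The function names and the allele frequencies are invented for the example.

```python
from typing import Dict, Iterable


def locus_coverage(allele_freqs: Iterable[float]) -> float:
    """Fraction of diploid individuals carrying at least one covered allele at a locus."""
    covered = sum(allele_freqs)
    if not 0.0 <= covered <= 1.0:
        raise ValueError("allele frequencies at a locus must sum to a value in [0, 1]")
    # Probability that neither of the two chromosomes carries a covered allele.
    return 1.0 - (1.0 - covered) ** 2


def population_coverage(freqs_by_locus: Dict[str, Iterable[float]]) -> float:
    """Combine loci, assuming they are inherited independently."""
    not_covered = 1.0
    for freqs in freqs_by_locus.values():
        not_covered *= 1.0 - locus_coverage(freqs)
    return 1.0 - not_covered


# Invented frequencies of the alleles bound by a hypothetical epitope set.
hypothetical = {"HLA-A": [0.20, 0.10], "HLA-B": [0.15, 0.05]}
print(f"{population_coverage(hypothetical):.2%}")
```

The sketch shows why an epitope that binds several common alleles across different loci can reach high coverage even when no single allele is frequent.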
Phylogenetic
The phylogenetic tree revealed that the SUDV strains collected from Uganda in 2012, 2012 (1), 2012 (3) and 2011 could be the same strain, while the one collected from Canada in 2015 could be the same strain as Sudan 1976.
Alignment
Prediction of B-cell epitope: The reference glycoprotein (GP) was subjected to the Bepipred linear epitope, Emini surface accessibility and Kolaskar and Tongaonkar antigenicity methods in IEDB, which predict the probability of specific regions in the protein binding to the B-cell receptor, being on the surface and being immunogenic, respectively.
In the Bepipred linear epitope prediction method, the average binding score of the glycoprotein to B cells was 0.267, with a maximum of 3.228 and a minimum of -3.132; thirty-six epitopes from the conserved regions were predicted to elicit B lymphocytes, all with values equal to or greater than the default threshold of 0.35. In the Emini surface accessibility prediction, the average surface accessibility score of the protein was 1.000, with a maximum of 8.153 and a minimum of 0.030; twenty-five epitopes were potentially on the surface, passing the default threshold of 1.0.
In the Kolaskar and Tongaonkar antigenicity prediction, the average antigenicity was 1.016, with a maximum of 1.293 and a minimum of 0.848; eight epitopes scored above the default threshold of 1.016. Three epitopes passed all three tools (PPPPDGVR, ETFLQSPP, LQSPPIRE). The results are shown in Table 2 below and Figures 3-5, and the positions of the epitopes at the structural level are shown in Figure 6.
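The overlap step can be pictured as a simple filter on the scores reported in Table 2, where every listed peptide has already passed Bepipred. The sketch below applies the two remaining cutoffs to a handful of rows copied from that table. It is an illustration under stated assumptions rather than the authors' procedure: the cutoffs are treated as strict inequalities, the names are hypothetical, and the excerpt omits the manual shortening of longer peptides described in the table footnotes.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    peptide: str
    surface: float       # Emini surface accessibility score
    antigenicity: float  # Kolaskar and Tongaonkar antigenicity score


EMINI_THRESHOLD = 1.000
KOLASKAR_THRESHOLD = 1.016


def passes_both(c: Candidate) -> bool:
    """True when both scores are strictly above their default thresholds."""
    return c.surface > EMINI_THRESHOLD and c.antigenicity > KOLASKAR_THRESHOLD


# A few rows copied from Table 2 (excerpt only, not the full table).
excerpt = [
    Candidate("PPPPDGVR", 1.669, 1.031),
    Candidate("KAQGTGPCPGD", 0.574, 0.995),
    Candidate("ETFLQSPP", 1.204, 1.032),
    Candidate("LQSPPIRE", 1.323, 1.035),
    Candidate("NYTENTSSYY", 4.602, 0.973),
    Candidate("DLVPK", 0.857, 1.099),
]
selected = [c.peptide for c in excerpt if passes_both(c)]
print(selected)
```

NYTENTSSYY shows why both criteria matter: it is highly surface accessible but falls below the antigenicity cutoff, while DLVPK is antigenic but not predicted to be exposed.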
Prediction of cytotoxic T-lymphocyte epitopes and interaction with MHC class I:
The reference glycoprotein strain was analyzed using the IEDB MHC-I binding prediction tool to predict T-cell epitopes that interact with different MHC class I alleles, based on the Artificial Neural Network (ANN) method with a half-maximal inhibitory concentration (IC50) ≤ 100; 65 peptides were predicted to interact with different MHC-I alleles. The peptide RTYTILNRK (positions 580 to 588) interacted with the largest number of alleles, five (HLA-A*03:01, HLA-A*30:01, HLA-A*11:01, HLA-A*31:01, HLA-A*68:01), followed by RLASTVIYR (164 to 172) and YTENTSSYY (205 to 213), which interacted with four alleles each. The epitopes and their corresponding MHC-I alleles are shown in Table 3. Their positions at the structural level are shown in Figure 7.
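Ranking epitopes by the number of alleles they bind is a grouping operation over the prediction output. The following sketch shows one way to do it, using a few (epitope, allele, IC50) rows copied from Table 3; the function name `alleles_per_epitope` and the tuple layout are assumptions made for the example and do not reflect the IEDB output format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Prediction = Tuple[str, str, float]  # (epitope, allele, predicted IC50)


def alleles_per_epitope(
    predictions: Iterable[Prediction], cutoff: float = 100.0
) -> Dict[str, List[str]]:
    """Group the distinct alleles bound at or below the IC50 cutoff by epitope."""
    binders = defaultdict(list)
    for epitope, allele, ic50 in predictions:
        if ic50 <= cutoff and allele not in binders[epitope]:
            binders[epitope].append(allele)
    return dict(binders)


# Rows copied from Table 3 (excerpt only).
rows = [
    ("RTYTILNRK", "HLA-A*03:01", 22), ("RTYTILNRK", "HLA-A*30:01", 15),
    ("RTYTILNRK", "HLA-A*11:01", 15), ("RTYTILNRK", "HLA-A*31:01", 12),
    ("RTYTILNRK", "HLA-A*68:01", 48),
    ("RLASTVIYR", "HLA-A*03:01", 49), ("RLASTVIYR", "HLA-A*11:01", 43),
    ("RLASTVIYR", "HLA-A*31:01", 6), ("RLASTVIYR", "HLA-A*68:01", 73),
    ("FLRATTELR", "HLA-A*68:01", 98),
]
grouped = alleles_per_epitope(rows)
ranking = sorted(grouped, key=lambda e: len(grouped[e]), reverse=True)
for epitope in ranking:
    print(epitope, len(grouped[epitope]))
```

Lowering `cutoff` tightens the selection: with the rows above, a cutoff of 20 leaves three alleles for RTYTILNRK, one for RLASTVIYR and none for FLRATTELR.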
Prediction of T helper cell epitopes and interaction with MHC class II
The reference glycoprotein (GP) strain was analyzed using the IEDB MHC-II binding prediction tool based on NN-align with a half-maximal inhibitory concentration (IC50) ≤ 1000; 116 predicted epitopes were found to interact with MHC-II alleles. The peptide (core) FLRATTELR had high affinity to interact with twenty-two alleles: HLA-DPB1*04:01, HLA-DPB1*02:01, HLA-DPB1*05:01, HLA-DPB1*04:02, HLA-DPA1*01, HLA-DPA1*02:01, HLA-DPA1*03:01, HLA-DQA1*05:01, HLA-DQB1*02:01, HLA-DQB1*03:01, HLA-DRB1*01:01, HLA-DRB1*03:01, HLA-DRB1*04:05, HLA-DRB1*07:01, HLA-DRB1*08:02, HLA-DRB1*04:01, HLA-DRB1*04:01, HLA-DRB1*09:01, HLA-DRB1*11:01, HLA-DRB1*11:01, HLA-DRB4*01:01, HLA-DRB5*01:01. The results for the top four epitopes are listed in Table 4 below and their positions are shown in Figure 8.
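Because MHC class II predictions are made on 15-mers that share a 9-residue binding core, the same core appears at different offsets in several overlapping peptides. The sketch below derives the protein positions of a 15-mer from the position of its core; the helper name `peptide_span` is hypothetical, while the core position (572 to 580) and the peptide sequences are taken from Table 4.

```python
from typing import Tuple


def peptide_span(core: str, core_start: int, peptide: str) -> Tuple[int, int]:
    """Return the 1-based (start, end) of `peptide`, given where its core starts."""
    offset = peptide.find(core)
    if offset < 0:
        raise ValueError(f"core {core!r} not found in peptide {peptide!r}")
    start = core_start - offset
    return start, start + len(peptide) - 1


core, core_start = "FLRATTELR", 572
for peptide in ("TQALQLFLRATTELR", "ALQLFLRATTELRTY", "FLRATTELRTYTILN"):
    print(peptide, peptide_span(core, core_start, peptide))
```

Grouping the 15-mers by core in this way is what allows Table 4 to report one core with many peptide and allele combinations.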
Analysis of the population coverage
Epitopes of the glycoprotein (GP) predicted to interact with MHC-I and MHC-II alleles (especially high-affinity epitopes and those that can bind to a different set of alleles) were selected for population coverage analysis. The population coverage results for all epitopes in Sudan and the world are listed in Table 5.
For MHC class I, four epitopes that interact with the most frequent MHC class I alleles (FLYDRLAST, IIIAIIALL, MHNQNALVC and RTYTILNRK) gave high coverage percentages for the Sudan and whole-world populations with the IEDB population coverage tool. The maximum population coverage of these epitopes was 46.73% in the world, for FLYDRLAST, and 67.96% in Sudan, for MHNQNALVC.
For MHC class II, four epitopes that interact with the most frequent MHC class II alleles (FAEGVIAFL, FLRATTELR, FLYDRLAST and FVWVIILFQ) also gave high coverage percentages for the Sudan and whole-world populations with the IEDB population coverage tool. The maximum population coverage of these epitopes was 99.72% in the world, for FVWVIILFQ, and 97.36% in Sudan, for FLRATTELR. The population coverage results for the proposed epitopes in Sudan and the whole world are listed in Table 6.
Epitope | Start | End | Length | Surface accessibility (a) | Antigenicity score (b) |
---|---|---|---|---|---|
1*GSGVSTDIPSATKRWGFRSGVPP | 72 | 94 | 23 | 0.291 | 1.003 |
1*VSTDIPSATKR | 75 | 85 | 11 | 1.091 | 1.016 |
VSYEAGEWAE | 97 | 106 | 10 | 0.614 | 1 |
2*KKPDGSECLPPPPDGVRG | 114 | 131 | 18 | 1.369 | 1.018 |
2*PPPPDGVR | 123 | 130 | 8 | 1.669 | 1.031 |
KAQGTGPCPGD | 140 | 150 | 11 | 0.574 | 0.995 |
3*ETFLQSPPIREA | 191 | 202 | 12 | 1.009 | 1.016 |
3*ETFLQSPP | 191 | 198 | 8 | 1.204 | 1.032 |
3*LQSPPIRE | 194 | 201 | 8 | 1.323 | 1.035 |
NYTENTSSYY | 204 | 213 | 10 | 4.602 | 0.973 |
FGAQ | 225 | 228 | 4 | 0.531 | 1.011 |
RPHT | 246 | 249 | 4 | 2.108 | 0.988 |
KNL | 295 | 297 | 3 | 1.218 | 0.985 |
QLR | 300 | 302 | 3 | 1.285 | 1.046 |
NETEDDDA | 314 | 321 | 8 | 3.981 | 0.881 |
SSR | 323 | 325 | 3 | 1.616 | 0.966 |
GRISDRATR | 329 | 337 | 9 | 1.581 | 0.944 |
DLVPK | 341 | 345 | 5 | 0.857 | 1.099 |
PGM | 348 | 350 | 3 | 0.696 | 0.921 |
PEGETTLPSQNSTEGRRV | 356 | 373 | 18 | 4.124 | 0.964 |
VNTQETITE | 375 | 383 | 9 | 1.214 | 0.973 |
SSSQI | 406 | 410 | 5 | 0.792 | 1.041 |
SSSPT | 412 | 416 | 5 | 1.456 | 1.002 |
SPE | 420 | 422 | 3 | 1.648 | 0.979 |
TEE | 438 | 440 | 3 | 1.988 | 0.87 |
TTPP | 442 | 445 | 4 | 1.765 | 0.987 |
SPG | 448 | 450 | 3 | 0.942 | 0.983 |
TTEAPTLTTPENITT | 452 | 466 | 15 | 1.741 | 0.962 |
QESTSNGL | 474 | 481 | 8 | 1.24 | 0.962 |
SRRQ | 499 | 502 | 4 | 3.156 | 0.943 |
ATGKCNP | 507 | 513 | 7 | 0.609 | 1.004 |
AQEQHNA | 520 | 526 | 7 | 1.835 | 0.984 |
FGPGAEGIY | 535 | 543 | 9 | 0.232 | 1.001 |
CIE | 609 | 611 | 3 | 0.299 | 1.138 |
HDWTKN | 613 | 618 | 6 | 2.292 | 0.913 |
NPLPNQDNDDNWWT | 633 | 646 | 14 | 4.322 | 0.914 |
2* The peptide from 114 to 131 gives a higher score in all tools if it is shortened (123 to 130).
3* The peptide from 191 to 202 gives a higher score in all tools if it is shortened to (191 to 198) or (194 to 201).
a: default threshold value 1.000
b: default threshold value 1.016
Position of peptides is according to position of amino acids in the glycoprotein (GP).
Table 2: B-cell epitopes prediction.
Various studies support the assumption that a strong, specific and adaptive immune response, balanced between humoral and cell-mediated immunity, is needed to survive Ebola virus infection [6,19,42,43]. Several vaccine candidates are in clinical trials or about to enter them. Plasmid cocktails encoding the GP gene of EBOV, SUDV or both, as well as the NP gene of EBOV, have been used for vaccination against SUDV; although they were immunogenic at high doses, they failed to induce robust cellular immunity. Other candidates include recombinant viruses with different types of vectors, which have been shown to confer protection against SUDV in nonhuman primates, and virus-like particles (VLPs), which offer the additional advantage that they can be safely administered to immunosuppressed individuals. In general, these studies are hopeful, but improvement is needed to achieve better outcomes [44-53].
To our knowledge, no peptide prediction has been conducted specifically for this virus so far. Peptide vaccination plays a key role in combining a desirable immune response with minimal immunological side effects. Many peptide vaccines are under development, such as vaccines for human immunodeficiency virus (HIV), hepatitis C virus (HCV), malaria, foot and mouth disease, swine fever, influenza, anthrax and human papilloma virus (HPV), and therapeutic anti-cancer vaccines for pancreatic cancer, melanoma, non-small cell lung cancer, advanced hepatocellular carcinoma, cutaneous T-cell lymphoma and B-cell chronic lymphocytic leukaemia [54-67].
In this study, we aimed to determine the 100% conserved regions, which were then investigated to predict the most promising immunogenic epitopes for both B and T cells (the prime cells of humoral and cell-mediated immunity) as vaccine candidates for the highly lethal SUDV infection, using the spike glycoprotein (GP) as a target. SUDV GP is the key to cell attachment, entry and infectivity of the virus. Several recent studies conclude that SUDV GP alone can induce strong humoral and cellular immune responses against Sudan Ebola virus [6,68-70].
Figure 3: Bepipred linear epitope prediction.
Yellow areas above the threshold (red line) are proposed to be part of a B-cell epitope, while green areas are not.
Epitope | Start | End | Allele | ANN-ic50* | Percentile Rank |
---|---|---|---|---|---|
AAGIAWIPY | 526 | 534 | HLA-B*35:01 | 37 | 1 |
AEGVIAFLI | 177 | 185 | HLA-B*40:01 | 39 | 0.7 |
AEGVIAFLI | 177 | 185 | HLA-B*40:02 | 49 | 0.7 |
AENCYNLEI | 105 | 113 | HLA-B*40:01 | 27 | 0.5 |
AENCYNLEI | 105 | 113 | HLA-B*40:02 | 61 | 0.8 |
AENCYNLEI | 105 | 113 | HLA-B*44:02 | 18 | 0.2 |
ATSYLEYEI | 214 | 222 | HLA-A*68:02 | 68 | 1.7 |
ATSYLEYEI | 214 | 222 | HLA-A*32:01 | 90 | 0.7 |
DAASSRITK | 320 | 328 | HLA-A*68:01 | 15 | 0.4 |
DGAFFLYDR | 156 | 164 | HLA-A*68:01 | 63 | 1.3 |
EPHDWTKNI | 611 | 619 | HLA-C*12:03 | 72 | 2.1 |
ETFLQSPPI | 191 | 199 | HLA-A*68:02 | 8 | 0.4 |
ETTQALQLF | 564 | 572 | HLA-A*26:01 | 11 | 0.2 |
EVTEIDQLV | 44 | 52 | HLA-A*68:02 | 4 | 0.2 |
FAEGVIAFL | 176 | 184 | HLA-A*68:02 | 96 | 2.1 |
FAEGVIAFL | 176 | 184 | HLA-A*02:06 | 50 | 2.4 |
FAEGVIAFL | 176 | 184 | HLA-C*12:03 | 12 | 0.5 |
FFVWVIILF | 19 | 27 | HLA-A*23:01 | 29 | 0.3 |
FFVWVIILF | 19 | 27 | HLA-A*29:02 | 73 | 0.8 |
FLFQLNDTI | 252 | 260 | HLA-A*02:01 | 25 | 1 |
FLFQLNDTI | 252 | 260 | HLA-A*02:06 | 67 | 3 |
FLFQLNDTI | 252 | 260 | HLA-C*12:03 | 15 | 0.6 |
FLRATTELR | 572 | 580 | HLA-A*68:01 | 98 | 1.7 |
FLYDRLAST | 160 | 168 | HLA-A*02:01 | 11 | 0.5 |
FLYDRLAST | 160 | 168 | HLA-A*02:06 | 7 | 0.6 |
FLYDRLAST | 160 | 168 | HLA-C*12:03 | 48 | 1.8 |
FSMPLGVVT | 31 | 39 | HLA-C*12:03 | 68 | 2.1 |
GLMHNQNAL | 546 | 554 | HLA-A*02:01 | 84 | 2.2 |
GTGPCPGDY | 143 | 151 | HLA-A*30:02 | 31 | 0.4 |
GVIAFLILA | 179 | 187 | HLA-A*02:06 | 30 | 1.8 |
GVRGFPRCR | 128 | 136 | HLA-A*30:01 | 85 | 1.7 |
HLASTDQLK | 56 | 64 | HLA-A*68:01 | 76 | 1.4 |
HTPQFLFQL | 248 | 256 | HLA-A*68:02 | 37 | 1.3 |
IALLCVCKL | 666 | 674 | HLA-C*12:03 | 58 | 1.9 |
IHDFIDNPL | 627 | 635 | HLA-B*39:01 | 71 | 0.9 |
IIALLCVCK | 665 | 673 | HLA-A*11:01 | 57 | 1.2 |
IIALLCVCK | 665 | 673 | HLA-A*68:01 | 66 | 1.3 |
IIIAIIALL | 661 | 669 | HLA-A*02:01 | 40 | 1.4 |
IIIAIIALL | 661 | 669 | HLA-A*68:02 | 38 | 1.3 |
IIIAIIALL | 661 | 669 | HLA-A*02:06 | 41 | 2.2 |
ILGSLGLRK | 489 | 497 | HLA-A*03:01 | 40 | 0.3 |
KAIDFLLRR | 588 | 596 | HLA-A*11:01 | 50 | 1.1 |
KAIDFLLRR | 588 | 596 | HLA-A*31:01 | 33 | 0.9 |
KCNPNLHYW | 510 | 518 | HLA-B*58:01 | 27 | 0.5 |
KCNPNLHYW | 510 | 518 | HLA-B*57:01 | 32 | 0.2 |
KFRKSSFFV | 13 | 21 | HLA-A*30:01 | 3 | 0.2 |
KINQIIHDF | 622 | 630 | HLA-A*32:01 | 32 | 0.4 |
KRWGFRSGV | 84 | 92 | HLA-B*27:05 | 23 | 0.2 |
KSSFFVWVI | 16 | 24 | HLA-A*32:01 | 10 | 0.2 |
KSSFFVWVI | 16 | 24 | HLA-B*58:01 | 10 | 0.2 |
KSSFFVWVI | 16 | 24 | HLA-C*15:02 | 87 | 0.9 |
LAKPKETFL | 186 | 194 | HLA-C*12:03 | 79 | 2.3 |
LANETTQAL | 561 | 569 | HLA-B*35:01 | 11 | 0.4 |
LANETTQAL | 561 | 569 | HLA-C*12:03 | 24 | 0.9 |
LMHNQNALV | 547 | 555 | HLA-A*02:01 | 46 | 1.5 |
LQLPRDKFR | 7 | 15 | HLA-A*31:01 | 46 | 1.1 |
MHNQNALVC | 548 | 556 | HLA-B*39:01 | 45 | 0.7 |
MHNQNALVC | 548 | 556 | HLA-C*06:02 | 68 | 0.4 |
MHNQNALVC | 548 | 556 | HLA-C*07:01 | 41 | 0.5 |
NADIGEWAF | 282 | 290 | HLA-B*35:01 | 22 | 0.7 |
NFAEGVIAF | 175 | 183 | HLA-B*35:01 | 31 | 0.8 |
NPNLHYWTA | 512 | 520 | HLA-B*08:01 | 87 | 0.6 |
NQNALVCGL | 550 | 558 | HLA-A*02:06 | 97 | 3.5 |
NQNALVCGL | 550 | 558 | HLA-B*39:01 | 37 | 0.6 |
QLRGEELSF | 300 | 308 | HLA-B*15:01 | 94 | 1.3 |
RLASTVIYR | 164 | 172 | HLA-A*03:01 | 49 | 0.4 |
RLASTVIYR | 164 | 172 | HLA-A*11:01 | 43 | 0.9 |
RLASTVIYR | 164 | 172 | HLA-A*31:01 | 6 | 0.2 |
RLASTVIYR | 164 | 172 | HLA-A*68:01 | 73 | 1.4 |
RPHTPQFLF | 246 | 254 | HLA-B*07:02 | 31 | 0.5 |
RRWGGTCRI | 595 | 603 | HLA-B*27:05 | 21 | 0.2 |
RTYTILNRK | 580 | 588 | HLA-A*03:01 | 22 | 0.2 |
RTYTILNRK | 580 | 588 | HLA-A*30:01 | 15 | 0.5 |
RTYTILNRK | 580 | 588 | HLA-A*11:01 | 15 | 0.2 |
RTYTILNRK | 580 | 588 | HLA-A*31:01 | 12 | 0.4 |
RTYTILNRK | 580 | 588 | HLA-A*68:01 | 48 | 1.1 |
SATKRWGFR | 81 | 89 | HLA-A*31:01 | 24 | 0.8 |
SATKRWGFR | 81 | 89 | HLA-A*68:01 | 67 | 1.3 |
SSFFVWVII | 17 | 25 | HLA-A*68:02 | 20 | 0.8 |
SSFFVWVII | 17 | 25 | HLA-A*32:01 | 74 | 0.7 |
SSYYATSYL | 210 | 218 | HLA-A*68:02 | 21 | 0.8 |
SSYYATSYL | 210 | 218 | HLA-C*15:02 | 30 | 0.3 |
STDIPSATK | 76 | 84 | HLA-A*11:01 | 33 | 0.8 |
TELRTYTIL | 577 | 585 | HLA-B*40:01 | 13 | 0.3 |
TELRTYTIL | 577 | 585 | HLA-B*40:02 | 73 | 0.8 |
TPENITTAV | 460 | 468 | HLA-B*07:02 | 75 | 0.9 |
TQALQLFLR | 566 | 574 | HLA-A*31:01 | 42 | 1.1 |
TQALQLFLR | 566 | 574 | HLA-A*68:01 | 85 | 1.6 |
TSSYYATSY | 209 | 217 | HLA-B*15:01 | 60 | 0.9 |
TTELRTYTI | 576 | 584 | HLA-A*32:01 | 65 | 0.6 |
TTPENITTA | 459 | 467 | HLA-A*68:02 | 64 | 1.7 |
TTPENITTA | 459 | 467 | HLA-A*68:02 | 22 | 0.9 |
VIAFLILAK | 180 | 188 | HLA-A*03:01 | 33 | 0.3 |
VIAFLILAK | 180 | 188 | HLA-A*11:01 | 17 | 0.3 |
VVTNSTLEV | 37 | 45 | HLA-A*02:06 | 39 | 2.1 |
WTKNITDKI | 615 | 623 | HLA-A*68:02 | 65 | 1.7 |
YEIENFGAQ | 220 | 228 | HLA-B*18:01 | 33 | 0.3 |
YTENTSSYY | 205 | 213 | HLA-A*01:01 | 6 | 0.2 |
YTENTSSYY | 205 | 213 | HLA-A*29:02 | 56 | 0.8 |
YTENTSSYY | 205 | 213 | HLA-A*30:02 | 45 | 0.5 |
YTENTSSYY | 205 | 213 | HLA-C*12:03 | 84 | 2.4 |
YTILNRKAI | 582 | 590 | HLA-C*12:03 | 19 | 0.8 |
YYATSYLEY | 212 | 220 | HLA-A*29:02 | 3 | 0.2 |
*ANN ic50 is the inhibitory concentration needed for successful binding of the peptide to the MHC molecule, as predicted by the Artificial Neural Network method. The lower the value, the better the epitope.
Position of peptides is according to position of amino acids in the glycoprotein (GP).
Table 3: List of epitopes that had binding affinity with the MHC Class I alleles.
Core Sequence | Start | End | Peptide Sequence | Allele | IC50 | Rank |
---|---|---|---|---|---|---|
FAEGVIAFL | 176 | 184 | FAEGVIAFLILAKPK | HLA-DPA1*01:03/DPB1*02:01 | 448.4 | 20.28 |
HLA-DQA1*01:01/DQB1*05:01 | 525.3 | 9.38 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 88.1 | 13.65 | ||||
HLA-DRB1*04:05 | 569.2 | 31.35 | ||||
HLA-DRB1*07:01 | 458 | 32.04 | ||||
HLA-DRB1*04:01 | 528.3 | 29.85 | ||||
HLA-DRB1*09:01 | 658.2 | 30.42 | ||||
GVNFAEGVIAFLILA | HLA-DPA1*01/DPB1*04:01 | 644.9 | 17.17 | |||
HLA-DPA1*01:03/DPB1*02:01 | 215.1 | 13.45 | ||||
HLA-DPA1*02:01/DPB1*05:01 | 659.3 | 13.01 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 385.5 | 7.44 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 502.5 | 11.22 | ||||
HLA-DRB1*04:05 | 503.4 | 29.39 | ||||
HLA-DRB1*07:01 | 129.3 | 16.67 | ||||
HLA-DRB1*08:02 | 903 | 20.71 | ||||
HLA-DRB1*04:01 | 269.9 | 18.87 | ||||
HLA-DRB1*09:01 | 133.6 | 9.09 | ||||
HLA-DRB5*01:01 | 733 | 38.49 | ||||
IYRGVNFAEGVIAFL | HLA-DPA1*01:03/DPB1*02:01 | 213 | 13.37 | |||
HLA-DPA1*02:01/DPB1*01:01 | 384.3 | 26.62 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 688.4 | 11.43 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 299.3 | 6.73 | ||||
HLA-DRB1*09:01 | 121.3 | 8.31 | ||||
HLA-DRB1*15:01 | 586.7 | 31.5 | ||||
NFAEGVIAFLILAKP | HLA-DPA1*01:03/DPB1*02:01 | 349.8 | 17.73 | |||
HLA-DPA1*02:01/DPB1*05:01 | 723.9 | 13.98 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 449.9 | 8.36 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 667.5 | 14.57 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 60.6 | 10.43 | ||||
HLA-DRB1*04:05 | 680.8 | 34.4 | ||||
HLA-DRB1*07:01 | 346.6 | 28.14 | ||||
HLA-DRB1*04:01 | 379.8 | 24.08 | ||||
HLA-DRB1*09:01 | 370.7 | 20.8 | ||||
RGVNFAEGVIAFLIL | HLA-DPA1*01:03/DPB1*02:01 | 180.9 | 12.11 | |||
HLA-DPA1*02:01/DPB1*05:01 | 596.5 | 12.01 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 431.2 | 8.09 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 360.6 | 8.15 | ||||
HLA-DRB1*04:05 | 479.5 | 28.62 | ||||
HLA-DRB1*07:01 | 80.3 | 12.25 | ||||
HLA-DRB1*08:02 | 586.3 | 14.2 | ||||
HLA-DRB1*04:01 | 281.1 | 19.46 | ||||
HLA-DRB1*09:01 | 131.9 | 8.99 | ||||
HLA-DRB5*01:01 | 749.8 | 38.82 | ||||
VNFAEGVIAFLILAK | HLA-DPA1*01:03/DPB1*02:01 | 227.8 | 13.91 | |||
HLA-DPA1*02:01/DPB1*05:01 | 525.8 | 10.81 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 404.3 | 7.71 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 581.7 | 12.86 | ||||
HLA-DRB1*04:05 | 566.4 | 31.27 | ||||
HLA-DRB1*07:01 | 179 | 20.16 | ||||
HLA-DRB1*04:01 | 332.6 | 21.98 | ||||
HLA-DRB1*09:01 | 219.1 | 13.93 | ||||
HLA-DRB5*01:01 | 695.1 | 37.7 | ||||
YRGVNFAEGVIAFLI | HLA-DPA1*01/DPB1*04:01 | 942.1 | 21.11 | |||
HLA-DPA1*01:03/DPB1*02:01 | 218.3 | 13.56 | ||||
HLA-DPA1*02:01/DPB1*05:01 | 823.7 | 15.43 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 526.5 | 9.39 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 835.5 | 13.31 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 323.7 | 7.29 | ||||
HLA-DRB1*04:05 | 505.5 | 29.46 | ||||
HLA-DRB1*07:01 | 71.6 | 11.33 | ||||
HLA-DRB1*08:02 | 631.6 | 15.16 | ||||
HLA-DRB1*04:01 | 312.3 | 21.01 | ||||
HLA-DRB1*09:01 | 143.5 | 9.71 | ||||
HLA-DRB5*01:01 | 766.9 | 39.16 | ||||
FLRATTELR | 572 | 580 | ALQLFLRATTELRTY | HLA-DPA1*01:03/DPB1*02:01 | 282.8 | 15.74 |
HLA-DPA1*02:01/DPB1*05:01 | 203.4 | 4.49 | ||||
HLA-DPA1*03:01/DPB1*04:02 | 92.1 | 9.14 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 635.1 | 13.94 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 723.2 | 42.09 | ||||
HLA-DRB1*01:01 | 15.6 | 8.96 | ||||
HLA-DRB1*03:01 | 25.5 | 1.49 | ||||
HLA-DRB1*04:05 | 24.2 | 1.79 | ||||
HLA-DRB1*07:01 | 38.4 | 6.97 | ||||
HLA-DRB1*08:02 | 906.5 | 20.78 | ||||
HLA-DRB1*04:01 | 39.3 | 2.79 | ||||
HLA-DRB1*09:01 | 143.8 | 9.72 | ||||
HLA-DRB5*01:01 | 10.7 | 2.57 | ||||
FLRATTELRTYTILN | HLA-DPA1*01/DPB1*04:01 | 995.1 | 21.75 | |||
HLA-DPA1*02:01/DPB1*05:01 | 203.4 | 4.49 | ||||
HLA-DRB1*01:01 | 58 | 23.06 | ||||
HLA-DRB1*03:01 | 234.3 | 8.87 | ||||
HLA-DRB1*04:05 | 73.3 | 7.21 | ||||
HLA-DRB1*04:01 | 116.1 | 9.36 | ||||
HLA-DRB1*11:01 | 359.3 | 26.89 | ||||
HLA-DRB5*01:01 | 43.7 | 9.11 | ||||
LFLRATTELRTYTIL | HLA-DPA1*01:03/DPB1*02:01 | 431 | 19.87 | |||
HLA-DPA1*02:01/DPB1*05:01 | 156 | 3.38 | ||||
HLA-DRB1*01:01 | 30.1 | 15.51 | ||||
HLA-DRB1*03:01 | 77.9 | 4.12 | ||||
HLA-DRB1*04:05 | 50.6 | 4.92 | ||||
HLA-DRB1*04:01 | 75.4 | 6.09 | ||||
HLA-DRB1*11:01 | 208.9 | 20.75 | ||||
HLA-DRB4*01:01 | 419.1 | 24.48 | ||||
HLA-DRB5*01:01 | 24.8 | 5.9 | ||||
LQLFLRATTELRTYT | HLA-DPA1*01/DPB1*04:01 | 363.7 | 12.34 | |||
HLA-DPA1*01:03/DPB1*02:01 | 300.9 | 16.29 | ||||
HLA-DPA1*02:01/DPB1*05:01 | 182.2 | 4 | ||||
HLA-DPA1*03:01/DPB1*04:02 | 61 | 6.76 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 836.3 | 17.68 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 844.1 | 44.84 | ||||
HLA-DRB1*01:01 | 12.3 | 6.87 | ||||
HLA-DRB1*03:01 | 18.8 | 1.06 | ||||
HLA-DRB1*04:05 | 26.3 | 2.04 | ||||
HLA-DRB1*07:01 | 45.2 | 7.97 | ||||
HLA-DRB1*08:02 | 862 | 19.92 | ||||
HLA-DRB1*04:01 | 35.3 | 2.42 | ||||
HLA-DRB1*09:01 | 127 | 8.67 | ||||
HLA-DRB5*01:01 | 10 | 2.37 | ||||
QALQLFLRATTELRT | HLA-DPA1*01:03/DPB1*02:01 | 285.5 | 15.82 | |||
HLA-DPA1*02:01/DPB1*05:01 | 352.9 | 7.66 | ||||
HLA-DQA1*05:01/DQB1*02:01 | 574.9 | 12.72 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 741.4 | 42.52 | ||||
HLA-DRB1*01:01 | 23.5 | 12.91 | ||||
HLA-DRB1*03:01 | 45.1 | 2.64 | ||||
HLA-DRB1*04:05 | 24.9 | 1.88 | ||||
HLA-DRB1*07:01 | 36.5 | 6.69 | ||||
HLA-DRB1*04:01 | 49.2 | 3.74 | ||||
HLA-DRB1*09:01 | 207.8 | 13.36 | ||||
HLA-DRB5*01:01 | 13.4 | 3.3 | ||||
QLFLRATTELRTYTI | HLA-DPA1*01/DPB1*04:01 | 414.6 | 13.35 | |||
HLA-DPA1*02:01/DPB1*05:01 | 150 | 3.24 | ||||
HLA-DRB1*01:01 | 15.8 | 9.07 | ||||
HLA-DRB1*03:01 | 35 | 2.09 | ||||
HLA-DRB1*04:05 | 33.1 | 2.88 | ||||
HLA-DRB1*08:02 | 896.3 | 20.58 | ||||
HLA-DRB1*04:01 | 47.1 | 3.53 | ||||
HLA-DRB1*11:01 | 113.9 | 14.85 | ||||
HLA-DRB5*01:01 | 14.9 | 3.69 | ||||
TQALQLFLRATTELR | HLA-DPA1*01/DPB1*04:01 | 714.7 | 18.16 | |||
HLA-DQA1*05:01/DQB1*02:01 | 636.2 | 13.96 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 786.5 | 43.58 | ||||
HLA-DRB1*01:01 | 35.7 | 17.36 | ||||
HLA-DRB1*03:01 | 73.6 | 3.94 | ||||
HLA-DRB1*04:05 | 25.1 | 1.9 | ||||
HLA-DRB1*07:01 | 40.4 | 7.26 | ||||
HLA-DRB1*04:01 | 60.1 | 4.75 | ||||
HLA-DRB1*09:01 | 320.7 | 18.66 | ||||
HLA-DRB5*01:01 | 16 | 3.96 | ||||
FLYDRLAST | 160 | 168 | AFFLYDRLASTVIYR | HLA-DPA1*01:03/DPB1*02:01 | 4.2 | 0.3 |
HLA-DPA1*02:01/DPB1*01:01 | 26.8 | 2.45 | ||||
HLA-DPA1*03:01/DPB1*04:02 | 6.8 | 0.39 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 118.6 | 16.59 | ||||
HLA-DRB1*03:01 | 18.7 | 1.06 | ||||
HLA-DRB1*04:05 | 72.6 | 7.15 | ||||
HLA-DRB1*08:02 | 251 | 5.78 | ||||
HLA-DRB1*04:01 | 27.8 | 1.69 | ||||
HLA-DRB3*01:01 | 19.7 | 1.22 | ||||
HLA-DRB5*01:01 | 158.6 | 19.59 | ||||
DGAFFLYDRLASTVI | HLA-DPA1*01:03/DPB1*02:01 | 3.4 | 0.18 | |||
HLA-DPA1*02:01/DPB1*01:01 | 22 | 1.84 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 141.4 | 18.46 | ||||
HLA-DRB1*03:01 | 23 | 1.33 | ||||
HLA-DRB1*04:05 | 64.4 | 6.34 | ||||
HLA-DRB1*08:02 | 295.1 | 6.95 | ||||
HLA-DRB1*04:01 | 26.3 | 1.56 | ||||
HLA-DRB3*01:01 | 12.2 | 0.7 | ||||
HLA-DRB5*01:01 | 147.3 | 18.86 | ||||
FFLYDRLASTVIYRG | HLA-DPA1*03:01/DPB1*04:02 | 10.9 | 0.98 | |||
HLA-DQA1*05:01/DQB1*03:01 | 136 | 18.04 | ||||
HLA-DRB1*03:01 | 27.8 | 1.62 | ||||
HLA-DRB1*04:05 | 109.2 | 10.38 | ||||
HLA-DRB1*08:02 | 195.3 | 4.25 | ||||
HLA-DRB1*04:01 | 43 | 3.13 | ||||
HLA-DRB3*01:01 | 39.4 | 2.32 | ||||
HLA-DRB5*01:01 | 226.2 | 23.35 | ||||
FLYDRLASTVIYRGV | HLA-DPA1*01/DPB1*04:01 | 520.8 | 15.25 | |||
HLA-DPA1*01:03/DPB1*02:01 | 170.6 | 11.68 | ||||
HLA-DPA1*02:01/DPB1*01:01 | 144 | 13.97 | ||||
HLA-DPA1*03:01/DPB1*04:02 | 66.9 | 7.26 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 163.7 | 20.14 | ||||
HLA-DRB1*03:01 | 55.8 | 3.13 | ||||
HLA-DRB1*04:05 | 138.6 | 12.6 | ||||
HLA-DRB1*08:02 | 204.4 | 4.5 | ||||
HLA-DRB1*04:01 | 64.4 | 5.13 | ||||
HLA-DRB1*11:01 | 341.1 | 26.26 | ||||
HLA-DRB3*01:01 | 73 | 3.65 | ||||
HLA-DRB5*01:01 | 370.8 | 29.14 | ||||
GAFFLYDRLASTVIY | HLA-DPA1*01:03/DPB1*02:01 | 3.3 | 0.16 | |||
HLA-DPA1*02:01/DPB1*01:01 | 23.6 | 2.04 | ||||
HLA-DPA1*03:01/DPB1*04:02 | 6.9 | 0.4 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 126 | 17.2 | ||||
HLA-DRB1*03:01 | 13.2 | 0.64 | ||||
HLA-DRB1*04:05 | 57.6 | 5.68 | ||||
HLA-DRB1*08:02 | 209.1 | 4.63 | ||||
HLA-DRB1*04:01 | 21.5 | 1.11 | ||||
HLA-DRB3*01:01 | 11.6 | 0.64 | ||||
HLA-DRB5*01:01 | 124.3 | 17.23 | ||||
HKDGAFFLYDRLAST | HLA-DPA1*01:03/DPB1*02:01 | 4.5 | 0.34 | |||
HLA-DPA1*02:01/DPB1*01:01 | 27.6 | 2.55 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 194.9 | 22.25 | ||||
HLA-DRB1*01:01 | 61.6 | 23.82 | ||||
HLA-DRB1*03:01 | 113.2 | 5.49 | ||||
HLA-DRB1*04:05 | 113.7 | 10.74 | ||||
HLA-DRB1*08:02 | 767.5 | 18.03 | ||||
HLA-DRB1*04:01 | 69.5 | 5.59 | ||||
HLA-DRB3*01:01 | 18.1 | 1.12 | ||||
HLA-DRB5*01:01 | 263.4 | 25.08 | ||||
KDGAFFLYDRLASTV | HLA-DPA1*01:03/DPB1*02:01 | 3.6 | 0.21 | |||
HLA-DQA1*05:01/DQB1*03:01 | 171.3 | 20.68 | ||||
HLA-DRB1*01:01 | 22.8 | 12.6 | ||||
HLA-DRB1*03:01 | 47.7 | 2.75 | ||||
HLA-DRB1*04:05 | 92 | 8.95 | ||||
HLA-DRB1*07:01 | 907.8 | 42.68 | ||||
HLA-DRB1*08:02 | 483.3 | 11.75 | ||||
HLA-DRB1*04:01 | 54.2 | 4.21 | ||||
HLA-DRB3*01:01 | 14.5 | 0.86 | ||||
HLA-DRB5*01:01 | 216.7 | 22.85 | ||||
FVWVIILFQ | 20 | 28 | FFVWVIILFQKAFSM | HLA-DPA1*01/DPB1*04:01 | 178.7 | 7.94 |
HLA-DPA1*03:01/DPB1*04:02 | 42.1 | 4.95 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 676 | 10.77 | ||||
FRKSSFFVWVIILFQ | HLA-DPA1*03:01/DPB1*04:02 | 76.1 | 8.01 | |||
HLA-DQA1*03:01/DQB1*03:02 | 471.5 | 8.14 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 800.8 | 12.78 | ||||
HLA-DRB1*04:05 | 381.9 | 25.16 | ||||
FVWVIILFQKAFSMP | HLA-DPA1*01/DPB1*04:01 | 187.4 | 8.18 | |||
HLA-DPA1*03:01/DPB1*04:02 | 63.6 | 6.97 | ||||
HLA-DQA1*01:01/DQB1*05:01 | 989.5 | 14.75 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 998.4 | 15.78 | ||||
KSSFFVWVIILFQKA | HLA-DPA1*01/DPB1*04:01 | 112.7 | 5.77 | |||
HLA-DPA1*03:01/DPB1*04:02 | 36 | 4.28 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 405.3 | 6.16 | ||||
HLA-DRB1*04:05 | 484.5 | 28.79 | ||||
RKSSFFVWVIILFQK | HLA-DPA1*01:03/DPB1*02:01 | 26 | 3.02 | |||
HLA-DPA1*02:01/DPB1*01:01 | 43.3 | 4.5 | ||||
HLA-DPA1*03:01/DPB1*04:02 | 46.9 | 5.45 | ||||
HLA-DQA1*03:01/DQB1*03:02 | 387.4 | 6.47 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 410.2 | 6.25 | ||||
HLA-DRB1*04:05 | 453.4 | 27.74 | ||||
SFFVWVIILFQKAFS | HLA-DPA1*01/DPB1*04:01 | 140.1 | 6.72 | |||
HLA-DPA1*03:01/DPB1*04:02 | 32.9 | 3.93 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 534.7 | 8.4 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 905 | 46.11 | ||||
HLA-DRB1*04:05 | 187.6 | 15.84 | ||||
HLA-DRB1*07:01 | 870.6 | 41.96 | ||||
SSFFVWVIILFQKAF | HLA-DPA1*01/DPB1*04:01 | 117 | 5.92 | |||
HLA-DPA1*03:01/DPB1*04:02 | 30.6 | 3.65 | ||||
HLA-DQA1*03:01/DQB1*03:02 | 180.6 | 2.35 | ||||
HLA-DQA1*04:01/DQB1*04:02 | 413.8 | 6.31 | ||||
HLA-DQA1*05:01/DQB1*03:01 | 804.3 | 43.98 | ||||
HLA-DRB1*04:05 | 327.5 | 22.96 | ||||
HLA-DRB1*07:01 | 740.6 | 39.32 |
Peptide positions refer to amino acid positions in the envelope glycoprotein.
Table 4: The top four epitopes with binding affinity for MHC class II alleles.
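As a reading aid for Table 4, the following minimal sketch shows one way to tally how many distinct alleles each core sequence binds at or below an IC50 cutoff. The four rows are copied from Table 4; the function name `alleles_per_core` and the cutoff values used in the example are illustrative assumptions, not the threshold defined in the study's methods.

```python
def alleles_per_core(rows, ic50_cutoff):
    """Map each core sequence to the set of alleles bound at or below the cutoff.

    rows: iterable of (core, peptide, allele, ic50_nM) tuples.
    ic50_cutoff: illustrative parameter; not necessarily the study's threshold.
    """
    hits = {}
    for core, peptide, allele, ic50 in rows:
        # Sanity check: the 9-mer core must occur inside the longer peptide.
        if core not in peptide:
            raise ValueError(f"core {core} not found in peptide {peptide}")
        if ic50 <= ic50_cutoff:
            hits.setdefault(core, set()).add(allele)
    return hits


# A few rows taken from Table 4 (core, peptide, allele, IC50).
rows = [
    ("FLYDRLAST", "AFFLYDRLASTVIYR", "HLA-DPA1*01:03/DPB1*02:01", 4.2),
    ("FLYDRLAST", "AFFLYDRLASTVIYR", "HLA-DRB1*08:02", 251.0),
    ("FLYDRLAST", "DGAFFLYDRLASTVI", "HLA-DPA1*01:03/DPB1*02:01", 3.4),
    ("FVWVIILFQ", "FVWVIILFQKAFSMP", "HLA-DQA1*04:01/DQB1*04:02", 998.4),
]

strict = alleles_per_core(rows, ic50_cutoff=100.0)
loose = alleles_per_core(rows, ic50_cutoff=1000.0)
print({core: len(alleles) for core, alleles in loose.items()})
```

Counting alleles as a set means that the same allele bound by two overlapping peptides of one core is counted once.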
Epitope | Coverage World Class I | Coverage Sudan Class I | Total HLA hits | Epitope (core sequence) | Coverage World Class II | Coverage Sudan Class II | Total HLA hits |
---|---|---|---|---|---|---|---|
AAGIAWIPY | 8.42% | 6.67% | 1 | AAGIAWIPY | 85.67% | 50.60% | 4 |
AEGVIAFLI | 11.13% | 2.35% | 2 | ADIGEWAFW | 76.04% | 46.62% | 2 |
AENCYNLEI | 18.29% | 3.80% | 3 | AEGVIAFLI | 97.78% | 75.00% | 9 |
ATSYLEYEI | 7.05% | 20.37% | 2 | AFFLYDRLA | 56.18% | 41.45% | 5 |
DAASSRITK | 5.83% | 6.14% | 1 | AGIAWIPYF | 83.57% | 60.80% | 4 |
DGAFFLYDR | 5.83% | 6.14% | 1 | AKPKETFLQ | 43.67% | 0.91% | 2 |
EPHDWTKNI | 10.31% | 18.71% | 1 | ALVCGLRQL | 27.48% | 19.42% | 2 |
ETFLQSPPI | 2.50% | 10.07% | 1 | ASTVIYRGV | 42.10% | 0.00% | 3 |
ETTQALQLF | 5.82% | 3.24% | 1 | DDNWWTGWR | 31.46% | 26.56% | 2 |
EVTEIDQLV | 2.50% | 10.07% | 1 | DFIDNPLPN | 27.48% | 19.42% | 2 |
FAEGVIAFL | 14.29% | 26.90% | 3 | DKFRKSSFF | 92.37% | 55.19% | 6 |
FFVWVIILF | 9.21% | 13.77% | 2 | ELRTYTILN | 60.83% | 26.88% | 5 |
FLFQLNDTI | 46.73% | 39.93% | 3 | ENTSSYYAT | 76.04% | 46.62% | 2 |
FLRATTELR | 5.83% | 6.14% | 1 | EVTEIDQLV | 55.49% | 33.36% | 4 |
*FLYDRLAST | 46.73% | 39.93% | 3 | EWAENCYNL | 76.04% | 46.62% | 2 |
FSMPLGVVT | 10.31% | 18.71% | 1 | EWAFWENKK | 35.07% | 9.27% | 2 |
GLMHNQNAL | 39.08% | 26.10% | 1 | *FAEGVIAFL | 99.67% | 97.24% | 21 |
GTGPCPGDY | 2.43% | 5.19% | 1 | FFLYDRLAS | 97.74% | 87.28% | 10 |
GVIAFLILA | 1.95% | 0.00% | 1 | FFVWVIILF | 90.63% | 68.64% | 9 |
GVRGFPRCR | 3.89% | 11.73% | 1 | FIDNPLPNQ | 27.48% | 19.42% | 2 |
HLASTDQLK | 5.83% | 6.14% | 1 | FLFQLNDTI | 98.84% | 88.42% | 19 |
HTPQFLFQL | 2.50% | 10.07% | 1 | FLILAKPKE | 96.20% | 66.42% | 13 |
IALLCVCKL | 10.31% | 18.71% | 1 | FLLRRWGGT | 78.80% | 46.62% | 4 |
IHDFIDNPL | 2.75% | 5.86% | 1 | FLQSPPIRE | 90.90% | 80.22% | 15 |
IIALLCVCK | 20.88% | 9.26% | 2 | *FLRATTELR | 99.69% | 97.36% | 21 |
*IIIAIIALL | 42.53% | 34.71% | 3 | *FLYDRLAST | 99.38% | 95.87% | 19 |
ILGSLGLRK | 16.81% | 8.81% | 1 | FRKSSFFVW | 98.46% | 88.10% | 16 |
KAIDFLLRR | 20.45% | 8.69% | 2 | *FVWVIILFQ | 99.72% | 95.94% | 18 |
KCNPNLHYW | 7.26% | 8.67% | 2 | FWENKKNLS | 41.91% | 23.70% | 3 |
KFRKSSFFV | 3.89% | 11.73% | 1 | GAFFLYDRL | 83.07% | 65.03% | 9 |
KINQIIHDF | 4.61% | 10.88% | 1 | GVIAFLILA | 27.48% | 19.42% | 2 |
KRWGFRSGV | 4.78% | 1.26% | 1 | HKDGAFFLY | 43.67% | 0.91% | 2 |
KSSFFVWVI | 11.94% | 20.99% | 3 | HNAAGIAWI | 81.77% | 43.45% | 8 |
LAKPKETFL | 10.31% | 18.71% | 1 | HTPQFLFQL | 80.85% | 56.06% | 8 |
LANETTQAL | 17.86% | 24.13% | 2 | IAIIALLCV | 78.56% | 47.62% | 6 |
LMHNQNALV | 39.08% | 26.10% | 1 | IALLCVCKL | 69.87% | 38.55% | 7 |
LQLPRDKFR | 5.36% | 5.56% | 1 | IAWIPYFGP | 76.04% | 46.62% | 2 |
*MHNQNALVC | 35.14% | 67.96% | 3 | IENFGAQHS | 74.98% | 46.20% | 6 |
NADIGEWAF | 8.42% | 6.67% | 1 | IGEWAFWEN | 35.07% | 9.27% | 2 |
NFAEGVIAF | 8.42% | 6.67% | 1 | IGITGIIIA | 74.96% | 40.71% | 6 |
NPNLHYWTA | 10.55% | 6.21% | 1 | IHDFIDNPL | 92.12% | 78.01% | 14 |
NQNALVCGL | 4.64% | 5.86% | 2 | IIAIIALLC | 27.48% | 19.42% | 2 |
QLRGEELSF | 8.44% | 1.04% | 1 | IIHDFIDNP | 75.68% | 69.37% | 7 |
RLASTVIYR | 40.03% | 22.68% | 4 | IIIAIIALL | 87.90% | 86.32% | 8 |
RPHTPQFLF | 12.78% | 3.60% | 1 | ILAKPKETF | 91.22% | 65.33% | 6 |
RRWGGTCRI | 4.78% | 1.26% | 1 | ILGSLGLRK | 95.36% | 82.55% | 16 |
*RTYTILNRK | 43.03% | 32.96% | 5 | ILNRKAIDF | 57.32% | 22.87% | 7 |
SATKRWGFR | 11.03% | 11.52% | 2 | INADIGEWA | 90.79% | 83.03% | 9 |
SSFFVWVII | 7.05% | 20.37% | 2 | INQIIHDFI | 87.75% | 76.88% | 11 |
SSYYATSYL | 6.81% | 14.72% | 2 | ITGIIIAII | 59.73% | 37.57% | 7 |
STDIPSATK | 15.53% | 3.22% | 1 | IYRGVNFAE | 97.85% | 79.39% | 15 |
TELRTYTIL | 11.13% | 2.35% | 2 | IYTEGLMHN | 84.50% | 57.57% | 8 |
TPENITTAV | 12.78% | 3.60% | 1 | KAIDFLLRR | 97.93% | 88.76% | 12 |
TQALQLFLR | 11.03% | 11.52% | 2 | KDGAFFLYD | 93.44% | 56.81% | 6 |
TSSYYATSY | 8.44% | 1.04% | 1 | KFRKSSFFV | 82.94% | 59.48% | 8 |
TTELRTYTI | 4.61% | 10.88% | 1 | KINQIIHDF | 35.07% | 9.27% | 2 |
TTPENITTA | 2.50% | 10.07% | 1 | KKNLSEQLR | 35.07% | 9.27% | 3 |
TVTGILGSL | 2.50% | 10.07% | 1 | KPKETFLQS | 35.07% | 9.27% | 2 |
VIAFLILAK | 30.92% | 11.89% | 2 | KSSFFVWVI | 56.97% | 45.17% | 5 |
VVTNSTLEV | 1.95% | 0.00% | 1 | LAKPKETFL | 46.90% | 22.87% | 3 |
WTKNITDKI | 2.50% | 10.07% | 1 | LANETTQAL | 94.36% | 66.27% | 9 |
YEIENFGAQ | 7.32% | 3.89% | 1 | LASTDQLKS | 35.12% | 32.24% | 3 |
YTENTSSYY | 30.96% | 34.82% | 4 | LASTVIYRG | 55.64% | 33.74% | 6 |
YTILNRKAI | 10.31% | 18.71% | 1 | LEVTEIDQL | 94.75% | 75.18% | 12 |
YYATSYLEY | 3.89% | 3.35% | 1 | LEYEIENFG | 31.32% | 33.96% | 2 |
Epitope set | 98.19% | 97.94% | | LFLRATTEL | 96.07% | 80.79% | 9 |
 | | | | LHYWTAQEQ | 52.68% | 12.50% | 6 |
 | | | | LITSTVTGI | 72.55% | 51.10% | 9 |
 | | | | LKSVGLNLE | 71.70% | 31.28% | 9 |
 | | | | LLQLPRDKF | 62.71% | 45.47% | 5 |
 | | | | LNRKAIDFL | 76.85% | 42.90% | 6 |
 | | | | LQLPRDKFR | 43.67% | 0.91% | 3 |
 | | | | LRATTELRT | 93.26% | 72.21% | 10 |
 | | | | LRGEELSFE | 89.36% | 55.64% | 5 |
 | | | | LRTYTILNR | 90.57% | 61.95% | 12 |
 | | | | LVCGLRQLA | 52.81% | 21.85% | 4 |
 | | | | LYDRLASTV | 35.07% | 9.27% | 2 |
 | | | | NFAEGVIAF | 86.45% | 55.04% | 10 |
 | | | | NITTAVKTV | 42.10% | 0.00% | 3 |
 | | | | NLHYWTAQE | 47.36% | 34.83% | 5 |
 | | | | NQNALVCGL | 34.55% | 0.00% | 2 |
 | | | | NRKAIDFLL | 89.24% | 76.26% | 7 |
 | | | | NSTLEVTEI | 47.25% | 3.57% | 4 |
 | | | | NWWTGWRQW | 76.04% | 46.62% | 2 |
 | | | | QALQLFLRA | 89.03% | 53.33% | 4 |
 | | | | QFLFQLNDT | 54.34% | 27.71% | 4 |
 | | | | QIIHDFIDN | 27.48% | 19.42% | 2 |
 | | | | QLANETTQA | 34.55% | 0.00% | 2 |
 | | | | RKAIDFLLR | 97.74% | 87.28% | 10 |
 | | | | RKSSFFVWV | 63.19% | 34.70% | 4 |
 | | | | RLASTVIYR | 90.02% | 52.71% | 8 |
 | | | | SFFVWVIIL | 50.17% | 0.91% | 3 |
 | | | | SNGLITSTV | 78.05% | 43.64% | 6 |
 | | | | SSFFVWVII | 35.07% | 9.27% | 2 |
 | | | | STIGIRPSS | 10.54% | 15.91% | 1 |
 | | | | SYEAGEWAE | 80.26% | 48.53% | 4 |
 | | | | SYYATSYLE | 98.97% | 88.09% | 15 |
 | | | | TELRTYTIL | 89.03% | 53.33% | 4 |
 | | | | TLEVTEIDQ | 92.24% | 75.68% | 7 |
 | | | | TQALQLFLR | 79.70% | 28.48% | 9 |
 | | | | TSSYYATSY | 39.14% | 27.29% | 3 |
 | | | | TTQALQLFL | 87.65% | 76.26% | 6 |
 | | | | VCGLRQLAN | 82.82% | 47.95% | 9 |
 | | | | VIAFLILAK | 74.46% | 50.35% | 7 |
 | | | | VSYEAGEWA | 92.47% | 83.56% | 5 |
 | | | | VTNSTLEVT | 93.86% | 68.35% | 6 |
 | | | | VVTNSTLEV | 96.12% | 74.93% | 10 |
 | | | | VWVIILFQK | 84.96% | 60.53% | 6 |
 | | | | WAENCYNLE | 93.44% | 56.81% | 6 |
 | | | | WRQWIPAGI | 88.10% | 74.46% | 9 |
 | | | | WWTGWRQWI | 31.46% | 26.56% | 2 |
 | | | | YATSYLEYE | 98.95% | 93.65% | 15 |
 | | | | YDRLASTVI | 71.90% | 29.10% | 8 |
 | | | | YLEYEIENF | 99.31% | 94.14% | 16 |
 | | | | YRGVNFAEG | 96.66% | 85.29% | 10 |
 | | | | YYATSYLEY | 99.38% | 93.52% | 16 |
 | | | | Epitope set | 99.99% | 99.22% | |
*Proposed epitopes.
Table 5: Population coverage of all epitopes in both MHC class I and II in Sudan and the world.
Epitope | Coverage World Class I | Coverage Sudan Class I | Total HLA hits | Epitope (core sequence) | Coverage World Class II | Coverage Sudan Class II | Total HLA hits |
---|---|---|---|---|---|---|---|
FLYDRLAST | 46.73% | 39.93% | 3 | FAEGVIAFL | 99.67% | 97.24% | 21 |
IIIAIIALL | 42.53% | 34.71% | 3 | FLRATTELR | 99.69% | 97.36% | 21 |
MHNQNALVC | 35.14% | 67.96% | 3 | FLYDRLAST | 99.38% | 95.87% | 19 |
RTYTILNRK | 43.03% | 32.96% | 5 | FVWVIILFQ | 99.72% | 95.94% | 18 |
Epitope set | 85.08% | 91.30% | | Epitope set | 99.97% | 99.22% | |
Table 6: Population coverage of proposed epitopes in both MHC class I and II in Sudan and the world.
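The coverage percentages in Tables 5 and 6 were produced by the IEDB population coverage tool. The sketch below is not that tool's implementation; it only illustrates the general idea under two stated assumptions: Hardy-Weinberg proportions within each HLA locus and independence between loci. The allele frequencies are hypothetical placeholders, not Sudanese or world data, and the function names are illustrative.

```python
from collections import defaultdict


def locus_coverage(freqs):
    """Fraction of diploid individuals carrying at least one covered allele
    at a single locus, assuming Hardy-Weinberg proportions."""
    total = min(sum(freqs), 1.0)
    return 1.0 - (1.0 - total) ** 2


def population_coverage(allele_freqs, covered_alleles):
    """Combine per-locus coverage, assuming the loci are independent."""
    by_locus = defaultdict(list)
    for allele in set(covered_alleles):
        locus = allele.split("*")[0]  # e.g. "HLA-A" from "HLA-A*02:01"
        by_locus[locus].append(allele_freqs.get(allele, 0.0))
    uncovered = 1.0
    for freqs in by_locus.values():
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered


# Hypothetical allele frequencies, for illustration only.
example_freqs = {
    "HLA-A*02:01": 0.25,
    "HLA-C*06:02": 0.10,
    "HLA-C*07:01": 0.15,
}

single = population_coverage(example_freqs, ["HLA-A*02:01"])
combined = population_coverage(example_freqs, list(example_freqs))
print(f"{single:.4f} {combined:.4f}")
```

Under these assumptions, adding an epitope whose alleles lie at a locus not yet covered raises the combined figure more than adding further alleles at an already well-covered locus, which is consistent with an epitope set reaching higher coverage than any single epitope.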
Conservancy in the SUDV glycoprotein (GP) was found to be promising for peptide vaccine design. However, a limitation of the current study is that only a small number of SUDV glycoprotein variants were available, which reduces the significance of this conservancy.
To identify a potentially effective peptide antigen for B cells, an epitope should score above the threshold in the Bepipred linear epitope, Emini surface accessibility, and Kolaskar and Tongaonkar antigenicity prediction methods in IEDB. The epitopes listed in Table 2 are the only conserved regions among all SUDV spike glycoprotein strains available in the NCBI database up to 1 June 2016 that have a high probability of activating a humoral immune response. The epitope 114 KKPDGSECLPPPPDGVRG 131 passed all three prediction tools, as did its last 9-mer, PPPPDGVRG, indicating that this region is probably promising.
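The selection rule described above, in which a candidate is kept only if it exceeds the threshold of all three prediction methods, can be sketched as follows. The peptide labels, scores and threshold values are invented placeholders for illustration; they are not outputs of IEDB and not the values used in this study.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    peptide: str
    bepipred: float   # Bepipred linear epitope score
    emini: float      # Emini surface accessibility score
    kolaskar: float   # Kolaskar and Tongaonkar antigenicity score


def passes_all(candidate, thresholds):
    """Keep a candidate only if it is above threshold in all three methods."""
    return (
        candidate.bepipred > thresholds["bepipred"]
        and candidate.emini > thresholds["emini"]
        and candidate.kolaskar > thresholds["kolaskar"]
    )


# Placeholder thresholds and scores (hypothetical, for illustration only).
thresholds = {"bepipred": 0.5, "emini": 1.0, "kolaskar": 1.0}
candidates = [
    Candidate("PEPTIDE_A", bepipred=0.9, emini=1.4, kolaskar=1.05),
    Candidate("PEPTIDE_B", bepipred=0.7, emini=0.8, kolaskar=1.10),
    Candidate("PEPTIDE_C", bepipred=0.3, emini=1.6, kolaskar=1.20),
]

selected = [c.peptide for c in candidates if passes_all(c, thresholds)]
print(selected)
```

Requiring all three conditions at once is deliberately strict: a peptide that is predicted to be antigenic but buried, or exposed but not antigenic, is discarded.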
Since the T cell immune response is longer lasting than the B cell response, in which the antigen can easily escape the antibody memory response [71], and since CD8+ and CD4+ T cell responses play a major role in antiviral immunity [72], designing a vaccine based on T cell epitopes is much more promising. Among the 65 conserved T cell epitopes predicted to interact with MHC class I, shown in Table 3, the epitope MHNQNALVC interacted with only three MHC I alleles under the selected threshold. However, this epitope is very promising because it interacted with HLA-C*06:02 and HLA-C*07:01, which are very frequent in the Sudanese population [73-75]. Likewise, FLYDRLAST was predicted to bind with good affinity to HLA-A*02:01, the predominant MHC I allele worldwide, which is capable of eliciting strong CTL responses. The epitope 246 RPHTPQFLF 254 has been proposed by different in silico prediction studies. Interestingly, this epitope and TPENITTAV are the only epitopes predicted to bind to HLA-B*07, the allele that Sanchez et al. [76] concluded induces a life-saving, robust cellular immune response among SUDV survivors.
The MHC I epitope FLYDRLAST also shows high potential to induce an MHC II response, as seen in Table 4: it was predicted to bind several HLA-DR, DP and DQ alleles, indicating that this region deserves further attention. All proposed MHC I and MHC II epitopes were chosen to give the best population coverage with the lowest number of peptides for use in a multi-epitope vaccine against the highly lethal Sudan Ebola virus.
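The trade-off between high coverage and a small number of peptides can be illustrated with a simple greedy selection. This is a sketch of one possible strategy, not the procedure used in this study: it uses the number of newly covered alleles as a stand-in for population coverage, and the epitope labels and allele sets are hypothetical.

```python
def greedy_select(epitope_alleles, max_peptides):
    """Repeatedly pick the epitope that adds the most not-yet-covered alleles.

    epitope_alleles: dict mapping epitope label -> set of alleles it binds.
    Returns (chosen epitopes in pick order, set of covered alleles).
    """
    chosen, covered = [], set()
    remaining = dict(epitope_alleles)
    while remaining and len(chosen) < max_peptides:
        # sorted() makes ties resolve alphabetically, so the result is deterministic.
        best = max(sorted(remaining), key=lambda e: len(remaining[e] - covered))
        if not remaining[best] - covered:
            break  # no remaining epitope adds a new allele
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical epitope-to-allele assignments, for illustration only.
toy = {
    "E1": {"A1", "A2", "A3"},
    "E2": {"A3", "A4"},
    "E3": {"A1", "A2"},
}

picked, covered_alleles = greedy_select(toy, max_peptides=2)
print(picked, sorted(covered_alleles))
```

In this toy case E3 is never selected because every allele it binds is already covered by E1. A fuller version would weight each allele by its population frequency rather than counting alleles equally.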
As the incidence of infections by new lethal viruses increases, and as humans are infected by viruses previously recognized only as zoonotic, the need for new technology grows. Bioinformatics techniques meet this need and reduce the time and effort spent in designing new vaccines and therapies.
Sudan Ebola virus causes a life-threatening infection, which reinforces the need to develop a protective vaccine. The fact that all Ebola species are associated with high mortality rates increases the need for a vaccine against all filoviruses. Several epitopes proposed in this study, especially FLYDRLAST, which was previously suggested by Srivastava et al. [77] as a peptide vaccine against Ebola virus, could form a powerful multi-epitope vaccine against SUDV after in vivo and in vitro verification.