ISSN: 1745-7580
Research Article - (2017) Volume 13, Issue 2
Merkel cell polyomavirus is a non-enveloped dsDNA virus of the Polyomaviridae family that is linked to an uncommon, aggressive skin malignancy. The poor prognosis and the limited understanding of disease pathogenesis warrant innovative treatment. In this study we aim to predict immunogenic T and B cell epitopes from the VP1 protein of all Merkel cell polyomavirus strains, using immunoinformatics approaches, to aid effective epitope-based vaccine design. We retrieved 97 full-length VP1 protein sequences (423 amino acids each) of Merkel cell polyomavirus from the NCBI database. These sequences were analyzed to determine the conserved regions, which were then used to predict epitopes with the IEDB immunoinformatics algorithms. For B cells, three epitopes were predicted as peptide vaccine candidates (QEKTVY, KTVYPK and QEKTVYP). For T cells, the predicted class I peptides (SLFSNLMPK, LQMWEAISV and LLVKGGVEV) were found to cover the maximum number of MHC I alleles. The highest scoring MHC class II binding peptides were IELYLNPRM, ISSLINVHY and INSLFSNLM. Further experiments will be needed to confirm the potential of these predicted epitopes for the development of an efficacious vaccine.
Keywords: Merkel cell polyomavirus (MCPYV), Epitope, Peptide vaccine, Immune Epitope Database (IEDB)
Merkel cell polyomavirus (MCPYV) is a recently discovered small, non-enveloped, circular double-stranded DNA virus etiologically linked to an uncommon but highly lethal form of skin malignancy, Merkel cell carcinoma (MCC). MCPYV belongs to the Orthopolyomavirus genus of the Polyomaviridae family, which includes mammalian polyomaviruses such as murine PyV (MPyV), simian virus 40 (SV40) and the human polyomaviruses JC (JCPyV) and BK (BKPyV) [1-11]. The 5.4 kb genome consists of two regions: the early region, which encodes the large tumour (T) and small T antigens, and the late region, which encodes the structural proteins VP1, VP2 and VP3 that form the viral capsid [2-5]. VP1 makes up more than 70% of the total protein content of virus particles; it is the major structural protein and is responsible for the immunogenic response inside the host body [4,10]. Antibodies against the VP1 protein are likely to be expressed in 90% of MCC tumors [11]; VP1 therefore represents an ideal therapeutic candidate for designing an immunoprophylactic vaccine [10,12,13].
Merkel cell carcinoma is an aggressive, lethal neuroectodermal malignancy arising from mechanoreceptor Merkel cells [3,6,11,14,15]. MCC was first described in 1972 by Cyril Toker, who noted a colored, painless solid nodule in five patients: two older men, who later died as a result of the tumour, and three older women. Yet the pathogenesis and etiology of MCC remain poorly understood [3,6]. MCC is rare, but its incidence in the United States has tripled over the past two decades to 1500 cases per year, and 2,500 new cases are diagnosed in the E.U. [11,14,16]. Epidemiological studies have revealed that older, lighter-skinned and immunosuppressed individuals, such as those infected with HIV and/or diagnosed with AIDS, are more susceptible to infection [1,14,17-19]. In 2008, a novel Merkel cell polyomavirus was discovered and found to be integrated in, and associated with, 80% of MCC tumors [1,6,20,21]. It has thus been confirmed as the etiological agent, joining six other viruses now known to cause human cancer either directly or indirectly [7,11,22].
Developing an advanced vaccine for MCC that specifically targets the immunogenic proteins is of vital significance to overcome this devastating disease [23]. In a previous study, a DNA vaccine encoding the large and small T antigens of the virus was developed and shown to induce both CD4+ and CD8+ T lymphocyte responses [24]. However, an epitope-based peptide vaccine could be another possible candidate. The aim of the present study is to use an immunoinformatics approach to predict promiscuous epitopes in the MCPYV VP1 protein that bind to B cells as well as to both classes of MHC molecules, covering a maximum number of HLA molecules in a given population; this is a prerequisite for epitope-based vaccine design.
Epitope-based subunit vaccines offer a much stronger and more measured immune response and avoid the possible fatal consequences of employing entire viral proteins and peptides [23,25,26]. The poor prognosis of MCC patients and the limited understanding of disease pathogenesis warrant innovative treatments to control MCC [24]. This is the first study concerning Merkel cell polyomavirus VP1 protein vaccine design using immunoinformatics tools.
Protein sequence retrieval
A set of 97 available virulent strains of Merkel cell polyomavirus (MCPYV) from different geographic regions was retrieved from the NCBI database (https://www.ncbi.nlm.nih.gov/protein/?term=Merkel+cell+polyomavirus+VP1).
These sequences were retrieved in October 2016 and selected for immunoinformatic analysis. They were isolated in different geographical areas (USA, Japan, China, Germany, France and Lithuania) between 1995 and 2011. The 97 retrieved VP1 sequences, each 423 amino acids long, are listed with their accession numbers and collection areas in Table 1.
Bepipred epitope | Start | End | Length | Emini surface accessibility (threshold 1.000) | Kolaskar and Tongaonkar antigenicity (threshold 1.031) |
---|---|---|---|---|---|
MAPKRKASSTCKT | 1 | 13 | 13 | 1.294 | 0.995 |
KRQC | 15 | 18 | 4 | 1.217 | 1.057 |
GCCPN | 23 | 27 | 5 | 0.179 | 1.108 |
GEDSI | 48 | 52 | 5 | 0.679 | 0.951 |
VNSPDLPT | 65 | 72 | 8 | 0.802 | 1.04 |
DLQPKGSSPDQPIKENLP | 82 | 99 | 18 | 3.112 | 1.003 |
GAGIPVS | 153 | 159 | 7 | 0.151 | 1.06 |
EPL | 171 | 173 | 3 | 0.969 | 1.055 |
TTNGGPIT | 189 | 196 | 8 | 0.542 | 0.933 |
MTPKNQGLDPQAKAKLDKDGNYP | 205 | 227 | 23 | 4.957 | 0.972 |
PSKNENSRYYGSIQTGSQTP | 235 | 254 | 20 | 5.346 | 0.973 |
GVGPLC | 272 | 277 | 6 | 0.094 | 1.143 |
KVSGQPMEG | 338 | 346 | 9 | 0.71 | 0.981 |
DNQ | 348 | 350 | 3 | 2.04 | 0.886 |
LPG | 363 | 365 | 3 | 0.554 | 1.063 |
EGSE | 358 | 361 | 4 | 1.331 | 0.897 |
GQEKTVYPK * | 377 | 385 | 9 | 2.448 | 1.013 |
QEKTVYP* | 378 | 384 | 7 | 2.208 | 1.045 |
QEKTVY* | 378 | 383 | 6 | 1.912 | 1.042 |
EKTVYP | 379 | 384 | 6 | 1.707 | 1.05 |
EKTVYPK | 379 | 385 | 7 | 2.549 | 1.033 |
KTVYPK* | 380 | 385 | 6 | 1.971 | 1.063 |
SVAPA | 387 | 391 | 5 | 0.397 | 1.117 |
Table 1: List of B-cell epitopes predicted by different scales from the VP1 protein of Merkel cell polyomavirus. *The peptide from 377 to 385 gives a higher score in Kolaskar and Tongaonkar antigenicity if it is shortened to 7 amino acids (378 to 384) or to 6 amino acids (378 to 383 and 380 to 385).
Phylogeny and alignment
The retrieved sequences were subjected to phylogenetic and alignment analysis to determine the origin of each strain and the degree of conservation, using different tools from http://www.phylogeny.fr [27]. The phylogenetic tree and alignment are presented in Figures 1 and 2.
Figure 1: Flowchart representing the immunoinformatics prediction of potential B and T lymphocyte epitopes for the development of peptide vaccine targeting MCPYV.
Conserved regions determination
The BioEdit sequence alignment editor (v7.0.9) was used to align the retrieved sequences and obtain conserved regions with the aid of ClustalW (Hall, 1999), by comparing the full-length amino acid sequences of the 97 VP1 strains against the MCPYV reference sequence (GenBank accession number YP_009111420.1). Regions in which the amino acid sequences were 100% identical across all strains were selected as conserved regions [28].
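The 100% identity criterion can be illustrated with a minimal Python sketch. It is not the BioEdit procedure itself: the function name `conserved_regions`, the `min_len` parameter and the toy alignment are assumptions made for illustration, and the input is assumed to be already aligned (equal-length strings, gaps written as `-`).

```python
def conserved_regions(aligned, min_len=1):
    """Return (start, end, segment) for runs of columns identical in every sequence.

    Positions are 1-based and inclusive, relative to the alignment.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    regions = []
    run_start = None
    for i in range(length):
        column = {seq[i] for seq in aligned}
        # A column is conserved only if every sequence has the same residue (no gaps).
        is_conserved = len(column) == 1 and "-" not in column
        if is_conserved and run_start is None:
            run_start = i
        elif not is_conserved and run_start is not None:
            if i - run_start >= min_len:
                regions.append((run_start + 1, i, aligned[0][run_start:i]))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and length - run_start >= min_len:
        regions.append((run_start + 1, length, aligned[0][run_start:]))
    return regions


# Toy alignment (hypothetical sequences, not real VP1 data).
toy_alignment = [
    "MAPKRKASSTCK",
    "MAPKRKTSSTCK",
    "MAPKRKASSTCR",
]
regions = conserved_regions(toy_alignment, min_len=3)
print(regions)
```

A candidate epitope would then be kept only if it lies entirely inside one of the returned regions.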
Prediction of B-cell epitopes
When immunogenic B cell epitopes interact with B lymphocytes, the B lymphocyte differentiates into antibody-secreting plasma cells and memory cells. A B cell epitope is characterized by being accessible and antigenic [29]. B cell epitopes were predicted using tools from the Immune Epitope Database analysis resource (IEDB-AR) (http://tools.iedb.org/bcell/), starting with Bepipred linear epitope prediction [30,31]. The reference sequence was submitted to the Bepipred tool to predict the probability that specific regions of the protein bind the B cell receptor, with a default threshold value of 0.393. The predicted epitopes were then checked in BioEdit and only 100% conserved epitopes were selected. Finally, IEDB tools were used to predict surface accessibility with the Emini method [32] and antigenicity with the Kolaskar and Tongaonkar method [33], with thresholds of 1.000 and 1.031 respectively.
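The final filtering step can be sketched in Python as follows. The scores are copied from a few rows of Table 1; the function name `passes_both` and the choice to treat a score equal to the threshold as passing are assumptions for illustration, not part of the IEDB tools.

```python
EMINI_THRESHOLD = 1.000
ANTIGENICITY_THRESHOLD = 1.031

# (peptide, start, end, Emini score, Kolaskar and Tongaonkar score), excerpted from Table 1
candidates = [
    ("KRQC", 15, 18, 1.217, 1.057),
    ("DLQPKGSSPDQPIKENLP", 82, 99, 3.112, 1.003),
    ("GVGPLC", 272, 277, 0.094, 1.143),
    ("GQEKTVYPK", 377, 385, 2.448, 1.013),
    ("QEKTVYP", 378, 384, 2.208, 1.045),
    ("QEKTVY", 378, 383, 1.912, 1.042),
    ("KTVYPK", 380, 385, 1.971, 1.063),
]


def passes_both(emini, antigenicity):
    """True when a peptide clears both the accessibility and antigenicity cutoffs."""
    return emini >= EMINI_THRESHOLD and antigenicity >= ANTIGENICITY_THRESHOLD


selected = [pep for pep, _start, _end, emini, ag in candidates if passes_both(emini, ag)]
print(selected)
```

In this excerpt the 9-mer GQEKTVYPK fails the antigenicity cutoff while its shortened forms pass, which matches the note under Table 1. KRQC also clears both cutoffs here; the B cell epitopes proposed in the article are the three peptides from the 378 to 385 region.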
Binding predictions for MHC class I
To predict peptides that bind to MHC class I, the reference sequence was submitted to the MHC-I binding prediction tool in IEDB (http://tools.iedb.org/mhci/). Presentation of MHC-I peptide complexes to T lymphocytes involves several steps; the step predicted here was the binding of cleaved peptides to MHC molecules. Available prediction methods include the Artificial Neural Network (ANN), the Stabilized Matrix Method (SMM) and scoring matrices derived from combinatorial peptide libraries (Comblib_Sidney2008); the ANN method was used [34-38]. Epitope length was set to 9-mers prior to prediction. Conserved epitopes that bound to alleles with a half-maximal inhibitory concentration (IC50) of 100 or less were selected for further analysis [34].
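The IC50 selection rule, and the per-peptide allele count that later appears as "Total HLA hits", can be sketched as below. The rows are copied from Table 2; the function name `alleles_per_peptide` and its `cutoff` parameter are illustrative assumptions, not IEDB identifiers.

```python
from collections import defaultdict

IC50_CUTOFF = 100.0

# (peptide, allele, ANN IC50) rows excerpted from Table 2
predictions = [
    ("SLFSNLMPK", "HLA-A*03:01", 8.36),
    ("SLFSNLMPK", "HLA-A*11:01", 5.05),
    ("SLFSNLMPK", "HLA-A*30:01", 51.63),
    ("SLFSNLMPK", "HLA-A*68:01", 43.73),
    ("LQMWEAISV", "HLA-A*02:01", 21.99),
    ("LQMWEAISV", "HLA-A*02:06", 4.27),
    ("LQMWEAISV", "HLA-B*39:01", 74.99),
    ("LLVKGGVEV", "HLA-A*02:01", 83.35),
    ("LLVKGGVEV", "HLA-A*02:06", 31.73),
]


def alleles_per_peptide(rows, cutoff=IC50_CUTOFF):
    """Group alleles by peptide, keeping only predictions at or below the IC50 cutoff."""
    hits = defaultdict(list)
    for peptide, allele, ic50 in rows:
        if ic50 <= cutoff:
            hits[peptide].append(allele)
    return dict(hits)


hits = alleles_per_peptide(predictions)
# Rank peptides by the number of alleles they are predicted to bind.
ranked = sorted(hits, key=lambda p: len(hits[p]), reverse=True)
for peptide in ranked:
    print(peptide, len(hits[peptide]), hits[peptide])

# A stricter, purely illustrative cutoff shows how the ranking depends on the threshold.
strict_hits = alleles_per_peptide(predictions, cutoff=50.0)
```

With the cutoff of 100 used in the article, the counts for these three peptides agree with the "Total HLA hits" column of Table 4.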
Binding predictions for MHC class II
Peptide binding to MHC class II molecules was assessed with the IEDB MHC-II prediction tool at http://tools.immuneepitope.org/mhcii/ [39,40]. Selected HLA-DR, HLA-DP and HLA-DQ alleles were analyzed. The MHC class II groove can bind peptides of different lengths, and this binding variability makes prediction difficult and less accurate [41]. MHC II binding can be predicted with five different IEDB tools (SMM_align, NN-align, Combinatorial Libraries, Sturniolo's method and NetMHCIIpan) in addition to the consensus method. The NN-align method was used to predict MHC class II epitopes [35]. All conserved epitopes that bound to many alleles with a half-maximal inhibitory concentration (IC50) of 1000 or less were selected for further analysis.
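Because class II predictions are reported for overlapping 15-mers that share a 9-mer binding core, it can help to enumerate every 15-mer window that contains a given core. The sketch below does this for the core IELYLNPRM; the fragment is assembled from the overlapping peptides listed in Table 3, and the function name `windows_containing_core` is an assumption for illustration.

```python
def windows_containing_core(sequence, core, window=15):
    """Return (start, peptide) for every window-length peptide that contains core.

    Start positions are 1-based and relative to the given sequence.
    """
    found = []
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        if core in peptide:
            found.append((start + 1, peptide))
    return found


# VP1 fragment spanning the overlapping 15-mers reported for this core in Table 3.
fragment = "EDSITQIELYLNPRMGVNSPD"
core = "IELYLNPRM"

windows = windows_containing_core(fragment, core)
for start, peptide in windows:
    print(start, peptide)
```

The seven windows produced here are the same seven 15-mers that appear under the IELYLNPRM core in Table 3.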
Population coverage calculation
All potential MHC I and MHC II binders from the Merkel cell polyomavirus VP1 capsid protein were assessed for population coverage against the whole world population, using the alleles they were predicted to interact with and the IEDB population coverage calculation tool at http://tools.iedb.org/tools/population/iedb_input [42]. The calculation is based on the total HLA hits score obtained from IEDB; these data are derived from the relative frequency of each allele at a particular locus in a population.
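The idea behind a coverage figure can be illustrated with a deliberately simplified Python sketch. It is not the IEDB implementation: it assumes Hardy-Weinberg proportions within each locus and independence between loci, and the allele frequencies in `example` are hypothetical placeholders rather than real population data.

```python
def locus_coverage(covered_allele_freqs):
    """Fraction of diploid individuals carrying at least one covered allele at a locus."""
    total = sum(covered_allele_freqs)
    if not 0.0 <= total <= 1.0:
        raise ValueError("summed allele frequencies must lie between 0 and 1")
    # An individual is uncovered only if both chromosomes carry uncovered alleles.
    return 1.0 - (1.0 - total) ** 2


def population_coverage(freqs_by_locus):
    """Combine per-locus coverage, assuming the loci are independent."""
    uncovered = 1.0
    for freqs in freqs_by_locus.values():
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered


# Hypothetical frequencies of the alleles bound by an epitope set (not real data).
example = {
    "HLA-A": [0.20, 0.10],
    "HLA-B": [0.20],
}
coverage = population_coverage(example)
print(f"{coverage:.2%}")
```

Under these assumptions, adding an epitope that binds an allele at a locus not yet covered raises the combined figure more than adding another binder for an allele that is already covered.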
Assessment of epitope allergenicity
AllerTOP (http://www.pharmfac.net/allertop) was used for allergenicity prediction [43]. The predicted B cell epitopes and the epitopes that bind to MHC I and II were submitted to AllerTOP, which returns either "probable allergen" or "probable non-allergen".
Homology modeling
The 3D structure of the Merkel cell polyomavirus VP1 capsid protein was obtained with RaptorX (http://www.raptor.uchicago.edu), which uses advanced homology detection techniques to build protein 3D structures. UCSF Chimera (version 1.8), currently available at the Chimera web site (http://www.cgl.ucsf.edu/chimera), was used to visualize the 3D structure. The structure was used to further verify the surface accessibility and hydrophilicity of the predicted B lymphocyte epitopes and to visualize all predicted T cell epitopes at the structural level [44,45].
Prediction of B-cell epitope
The VP1 capsid protein was subjected to Bepipred linear epitope prediction, Kolaskar and Tongaonkar antigenicity prediction and Emini surface accessibility prediction in IEDB (Figure 1). In the Bepipred linear epitope prediction, the average binding score of the protein to B cells was 0.393, with a maximum of 2.546 and a minimum of -1.464; all values equal to or greater than the default threshold of 0.393 were predicted to be potential B cell binders.
In the Emini surface accessibility prediction, the average surface accessibility score of the protein was 1.000, with a maximum of 5.749 and a minimum of 0.060; all values equal to or greater than the default threshold of 1.000 were considered potentially on the surface. In the Kolaskar and Tongaonkar antigenicity prediction, the average antigenicity was 1.033, with a maximum of 1.235 and a minimum of 0.877; all values greater than 1.026 are potential antigenic determinants. The results for all conserved predicted B cell epitopes are shown in Table 1 and Figures 3-6.
Figure 4: Bepipred linear epitope prediction. Yellow areas above the red line (threshold) are proposed to be part of B cell epitopes, while green areas are not.
MHC class I binding prediction
The VP1 capsid protein was submitted to the IEDB MHC-I binding prediction tool. Using the artificial neural network (ANN) method, 29 peptides were predicted to interact with different MHC class I alleles. The peptide SLFSNLMPK, from 330 to 338, interacted with the highest number of alleles, four in total (HLA-A*03:01, HLA-A*11:01, HLA-A*30:01 and HLA-A*68:01). The predicted epitopes and their corresponding MHC I alleles are listed in Table 2 and shown in Figures 7 and 8.
Peptide | Start | End | Length | Allele | ANN_ic50* | Percentile rank |
---|---|---|---|---|---|---|
APKRKASST | 2 | 10 | 9 | HLA-B*07:02 | 97.43 | 0.3 |
ASVPKLLVK | 29 | 37 | 9 | HLA-A*11:01 | 21.32 | 0.3 |
AYSVARVSL | 100 | 108 | 9 | HLA-C*14:02 | 54.49 | 0.2 |
DSITQIELY | 50 | 58 | 9 | HLA-A*26:01 | 49.81 | 0.1 |
DTLQMWEAI | 118 | 126 | 9 | HLA-A*32:01 | 48.77 | 0.3 |
EAISVKTEV | 124 | 132 | 9 | HLA-A*68:02 | 5.32 | 0.2 |
EVVGISSLI | 131 | 139 | 9 | HLA-A*26:01 | 31.33 | 0.1 |
EVVGISSLI | 131 | 139 | 9 | HLA-A*68:02 | 2.95 | 0.2 |
FSNTLTTVL | 259 | 267 | 9 | HLA-B*39:01 | 55.09 | 0.3 |
FSNTLTTVL | 259 | 267 | 9 | HLA-C*15:02 | 77.15 | 0.1 |
GVNYHMFAI | 160 | 168 | 9 | HLA-A*02:06 | 59.59 | 0.7 |
GVNYHMFAI | 160 | 168 | 9 | HLA-A*32:01 | 88.82 | 0.3 |
ISSLINVHY | 135 | 143 | 9 | HLA-B*58:01 | 54.83 | 0.3 |
ITCDTLQMW | 115 | 123 | 9 | HLA-B*57:01 | 73.45 | 0.3 |
ITCDTLQMW | 115 | 123 | 9 | HLA-B*58:01 | 18.43 | 0.3 |
KENLPAYSV | 95 | 103 | 9 | HLA-B*40:02 | 11.87 | 0.1 |
KENLPAYSV | 95 | 103 | 9 | HLA-B*44:02 | 77.75 | 0.1 |
KRKASSTCK | 4 | 12 | 9 | HLA-A*30:01 | 27.32 | 0.4 |
LLVKGGVEV* | 34 | 42 | 9 | HLA-A*02:01 | 83.35 | 0.5 |
LLVKGGVEV* | 34 | 42 | 9 | HLA-A*02:06 | 31.73 | 0.6 |
LPRYFNVTL | 305 | 313 | 9 | HLA-B*07:02 | 5.97 | 0.1 |
LPRYFNVTL | 305 | 313 | 9 | HLA-B*35:01 | 66.6 | 0.4 |
LQMWEAISV* | 120 | 128 | 9 | HLA-A*02:01 | 21.99 | 0.4 |
LQMWEAISV* | 120 | 128 | 9 | HLA-A*02:06 | 4.27 | 0.1 |
LQMWEAISV* | 120 | 128 | 9 | HLA-B*39:01 | 74.99 | 0.3 |
MPKVSGQPM | 336 | 344 | 9 | HLA-B*07:02 | 6.8 | 0.1 |
MPKVSGQPM | 336 | 344 | 9 | HLA-B*08:01 | 59.3 | 0.2 |
MPKVSGQPM | 336 | 344 | 9 | HLA-B*35:01 | 18.27 | 0.2 |
NEDITCDTL | 112 | 120 | 9 | HLA-B*40:01 | 25.59 | 0.2 |
NPYPVVNLI | 320 | 328 | 9 | HLA-B*51:01 | 68.27 | 0.1 |
NPYPVVNLI | 320 | 328 | 9 | HLA-B*53:01 | 78.09 | 0.3 |
NVHYWDMKR | 140 | 148 | 9 | HLA-A*31:01 | 93.09 | 0.5 |
NVHYWDMKR | 140 | 148 | 9 | HLA-A*68:01 | 11.03 | 0.1 |
QMWEAISVK | 121 | 129 | 9 | HLA-A*03:01 | 72.45 | 0.2 |
RVHDYGAGI | 148 | 156 | 9 | HLA-A*30:01 | 28.75 | 0.4 |
RYFNVTLRK | 307 | 315 | 9 | HLA-A*11:01 | 39.91 | 0.4 |
RYFNVTLRK | 307 | 315 | 9 | HLA-A*30:01 | 11.58 | 0.2 |
RYFNVTLRK | 307 | 315 | 9 | HLA-A*31:01 | 52.51 | 0.4 |
RYYGSIQTG | 242 | 250 | 9 | HLA-C*14:02 | 77.25 | 0.3 |
SKNENSRYY | 236 | 244 | 9 | HLA-C*06:02 | 90.48 | 0.1 |
SLFSNLMPK* | 330 | 338 | 9 | HLA-A*03:01 | 8.36 | 0.1 |
SLFSNLMPK* | 330 | 338 | 9 | HLA-A*11:01 | 5.05 | 0.2 |
SLFSNLMPK* | 330 | 338 | 9 | HLA-A*30:01 | 51.63 | 0.5 |
SLFSNLMPK* | 330 | 338 | 9 | HLA-A*68:01 | 43.73 | 0.6 |
SSLINVHYW | 136 | 144 | 9 | HLA-B*57:01 | 9.42 | 0.1 |
SSLINVHYW | 136 | 144 | 9 | HLA-B*58:01 | 4.82 | 0.1 |
SVARVSLPM | 102 | 110 | 9 | HLA-A*68:02 | 52.81 | 0.7 |
SVARVSLPM | 102 | 110 | 9 | HLA-B*07:02 | 75.63 | 0.2 |
SVARVSLPM | 102 | 110 | 9 | HLA-B*15:01 | 53.52 | 0.2 |
SVARVSLPM | 102 | 110 | 9 | HLA-B*35:01 | 73.5 | 0.4 |
TEVVGISSL | 130 | 138 | 9 | HLA-B*40:01 | 10.2 | 0.1 |
TEVVGISSL | 130 | 138 | 9 | HLA-B*40:02 | 46.89 | 0.2 |
Table 2: List of epitopes with binding affinity to MHC class I alleles. *Proposed epitopes. ANN_ic50*: the half-maximal inhibitory concentration (IC50), predicted by the artificial neural network method, is a measure of how effectively a peptide binds to the MHC molecule.
MHC class II binding prediction
As for MHC I, the protein was submitted to the MHC-II binding prediction tool using the NN-align method. A total of 156 predicted epitopes were found to interact with different MHC II alleles. The peptides with the highest affinity are listed in Table 3 and shown in Figure 9.
Epitope | Start | End | Allele | Peptide | IC50 | Percentile Rank |
---|---|---|---|---|---|---|
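IELYLNPRM | 55 | 63 | HLA-DPA1*02:01/DPB1*01:01 | TQIELYLNPRMGVNS | 512.2 | 31.11 |
IELYLNPRM | 55 | 63 | HLA-DPA1*02:01/DPB1*01:01 | IELYLNPRMGVNSPD | 722 | 36.84 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | ITQIELYLNPRMGVN | 188.6 | 4.11 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | SITQIELYLNPRMGV | 202 | 4.37 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | TQIELYLNPRMGVNS | 220.7 | 4.72 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | QIELYLNPRMGVNSP | 254.1 | 5.32 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | DSITQIELYLNPRMG | 284.2 | 5.83 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | EDSITQIELYLNPRM | 352.1 | 6.93 |
IELYLNPRM | 55 | 63 | HLA-DQA1*01:01/DQB1*05:01 | IELYLNPRMGVNSPD | 501.6 | 9.06 |
IELYLNPRM | 55 | 63 | HLA-DRB1*01:01 | DSITQIELYLNPRMG | 154 | 36.59 |
IELYLNPRM | 55 | 63 | HLA-DRB1*01:01 | EDSITQIELYLNPRM | 276.9 | 46.71 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | SITQIELYLNPRMGV | 67.6 | 5.42 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | DSITQIELYLNPRMG | 73.9 | 5.97 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | EDSITQIELYLNPRM | 75.6 | 6.11 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | ITQIELYLNPRMGVN | 76.8 | 6.22 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | TQIELYLNPRMGVNS | 160.6 | 12.51 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | QIELYLNPRMGVNSP | 230.4 | 16.76 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:01 | IELYLNPRMGVNSPD | 361.6 | 23.28 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | SITQIELYLNPRMGV | 52.7 | 5.15 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | EDSITQIELYLNPRM | 56.4 | 5.55 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | ITQIELYLNPRMGVN | 56.9 | 5.6 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | DSITQIELYLNPRMG | 59 | 5.82 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | TQIELYLNPRMGVNS | 94.9 | 9.19 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | QIELYLNPRMGVNSP | 258.6 | 19.74 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:05 | IELYLNPRMGVNSPD | 344.3 | 23.65 |
IELYLNPRM | 55 | 63 | HLA-DRB1*04:04 | QIELYLNPRMGVNSP | 507.2 | 34.95 |
IELYLNPRM | 55 | 63 | HLA-DRB1*07:01 | SITQIELYLNPRMGV | 845.5 | 41.52 |
IELYLNPRM | 55 | 63 | HLA-DRB1*09:01 | ITQIELYLNPRMGVN | 178.3 | 11.77 |
IELYLNPRM | 55 | 63 | HLA-DRB1*09:01 | TQIELYLNPRMGVNS | 187.3 | 12.25 |
IELYLNPRM | 55 | 63 | HLA-DRB1*09:01 | SITQIELYLNPRMGV | 232.8 | 14.6 |
IELYLNPRM | 55 | 63 | HLA-DRB1*09:01 | DSITQIELYLNPRMG | 558 | 27.41 |
IELYLNPRM | 55 | 63 | HLA-DRB1*09:01 | EDSITQIELYLNPRM | 674.9 | 30.92 |
IELYLNPRM | 55 | 63 | HLA-DRB1*13:02 | DSITQIELYLNPRMG | 290.8 | 12.48 |
IELYLNPRM | 55 | 63 | HLA-DRB1*13:02 | EDSITQIELYLNPRM | 296.5 | 12.63 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | SITQIELYLNPRMGV | 12.5 | 0.66 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | ITQIELYLNPRMGVN | 12.6 | 0.67 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | TQIELYLNPRMGVNS | 14.4 | 0.85 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | EDSITQIELYLNPRM | 15.4 | 0.97 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | DSITQIELYLNPRMG | 15.6 | 1 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | QIELYLNPRMGVNSP | 18.8 | 1.37 |
IELYLNPRM | 55 | 63 | HLA-DRB1*15:01 | IELYLNPRMGVNSPD | 74.9 | 7.66 |
IELYLNPRM | 55 | 63 | HLA-DRB4*01:01 | ITQIELYLNPRMGVN | 304.7 | 19.69 |
IELYLNPRM | 55 | 63 | HLA-DRB4*01:01 | SITQIELYLNPRMGV | 328.5 | 20.75 |
IELYLNPRM | 55 | 63 | HLA-DRB4*01:01 | DSITQIELYLNPRMG | 347.6 | 21.58 |
IELYLNPRM | 55 | 63 | HLA-DRB4*01:01 | TQIELYLNPRMGVNS | 373 | 22.64 |
IELYLNPRM | 55 | 63 | HLA-DRB4*01:01 | QIELYLNPRMGVNSP | 757.7 | 35.01 |
IELYLNPRM | 55 | 63 | HLA-DRB4*01:01 | IELYLNPRMGVNSPD | 847.4 | 37.22 |
IELYLNPRM | 55 | 63 | HLA-DRB5*01:01 | ITQIELYLNPRMGVN | 444.5 | 31.48 |
IELYLNPRM | 55 | 63 | HLA-DRB5*01:01 | SITQIELYLNPRMGV | 548.9 | 34.33 |
IELYLNPRM | 55 | 63 | HLA-DRB5*01:01 | DSITQIELYLNPRMG | 803 | 39.84 |
ISSLINVHY | 135 | 143 | HLA-DRB1*01:01 | VVGISSLINVHYWDM | 36.6 | 17.63 |
ISSLINVHY | 135 | 143 | HLA-DRB1*01:01 | TEVVGISSLINVHYW | 50.5 | 21.37 |
ISSLINVHY | 135 | 143 | HLA-DRB1*01:01 | EVVGISSLINVHYWD | 61.2 | 23.74 |
ISSLINVHY | 135 | 143 | HLA-DRB1*01:01 | VGISSLINVHYWDMK | 86.3 | 28.17 |
ISSLINVHY | 135 | 143 | HLA-DRB1*01:01 | GISSLINVHYWDMKR | 165.3 | 37.72 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | VVGISSLINVHYWDM | 91.8 | 7.45 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | EVVGISSLINVHYWD | 102 | 8.27 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | TEVVGISSLINVHYW | 104.1 | 8.44 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | KTEVVGISSLINVHY | 108.9 | 8.81 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | VGISSLINVHYWDMK | 144.7 | 11.42 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | GISSLINVHYWDMKR | 223.4 | 16.36 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:01 | ISSLINVHYWDMKRV | 448.1 | 26.85 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | TEVVGISSLINVHYW | 161 | 14.13 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | EVVGISSLINVHYWD | 178.7 | 15.28 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | VVGISSLINVHYWDM | 179.3 | 15.32 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | KTEVVGISSLINVHY | 190.4 | 16.01 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | VGISSLINVHYWDMK | 226.5 | 18.05 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | GISSLINVHYWDMKR | 354.6 | 24.09 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:05 | ISSLINVHYWDMKRV | 717.1 | 35.32 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:04 | GISSLINVHYWDMKR | 109.6 | 12.87 |
ISSLINVHY | 135 | 143 | HLA-DRB1*04:04 | ISSLINVHYWDMKRV | 298.3 | 26.2 |
ISSLINVHY | 135 | 143 | HLA-DRB1*07:01 | KTEVVGISSLINVHY | 55.7 | 9.44 |
ISSLINVHY | 135 | 143 | HLA-DRB1*07:01 | TEVVGISSLINVHYW | 75.8 | 11.77 |
ISSLINVHY | 135 | 143 | HLA-DRB1*07:01 | EVVGISSLINVHYWD | 124.7 | 16.31 |
ISSLINVHY | 135 | 143 | HLA-DRB1*08:02 | VVGISSLINVHYWDM | 549.4 | 13.36 |
ISSLINVHY | 135 | 143 | HLA-DRB1*09:01 | VVGISSLINVHYWDM | 647 | 30.09 |
ISSLINVHY | 135 | 143 | HLA-DRB1*09:01 | EVVGISSLINVHYWD | 682.2 | 31.13 |
ISSLINVHY | 135 | 143 | HLA-DRB1*09:01 | KTEVVGISSLINVHY | 699.1 | 31.57 |
ISSLINVHY | 135 | 143 | HLA-DRB1*09:01 | TEVVGISSLINVHYW | 753.9 | 33.05 |
ISSLINVHY | 135 | 143 | HLA-DRB1*09:01 | VGISSLINVHYWDMK | 774.1 | 33.53 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | VVGISSLINVHYWDM | 77.7 | 11.7 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | VGISSLINVHYWDMK | 123 | 15.55 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | EVVGISSLINVHYWD | 139.6 | 16.72 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | TEVVGISSLINVHYW | 206.7 | 20.64 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | GISSLINVHYWDMKR | 208 | 20.71 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | ISSLINVHYWDMKRV | 282 | 24.04 |
ISSLINVHY | 135 | 143 | HLA-DRB1*11:01 | KTEVVGISSLINVHY | 328.4 | 25.82 |
ISSLINVHY | 135 | 143 | HLA-DRB1*15:01 | GISSLINVHYWDMKR | 154.7 | 14.1 |
ISSLINVHY | 135 | 143 | HLA-DRB1*15:01 | KTEVVGISSLINVHY | 178 | 15.63 |
ISSLINVHY | 135 | 143 | HLA-DRB1*15:01 | TEVVGISSLINVHYW | 182.7 | 15.93 |
ISSLINVHY | 135 | 143 | HLA-DRB1*15:01 | EVVGISSLINVHYWD | 188.3 | 16.28 |
ISSLINVHY | 135 | 143 | HLA-DRB1*15:01 | VGISSLINVHYWDMK | 214.9 | 17.78 |
ISSLINVHY | 135 | 143 | HLA-DRB1*15:01 | VVGISSLINVHYWDM | 222 | 18.16 |
ISSLINVHY | 135 | 143 | HLA-DRB4*01:01 | TEVVGISSLINVHYW | 198.6 | 14.23 |
ISSLINVHY | 135 | 143 | HLA-DRB4*01:01 | KTEVVGISSLINVHY | 203.3 | 14.49 |
ISSLINVHY | 135 | 143 | HLA-DRB4*01:01 | EVVGISSLINVHYWD | 352.5 | 21.78 |
ISSLINVHY | 135 | 143 | HLA-DRB5*01:01 | VVGISSLINVHYWDM | 342.9 | 28.19 |
ISSLINVHY | 135 | 143 | HLA-DRB5*01:01 | GISSLINVHYWDMKR | 370 | 29.12 |
ISSLINVHY | 135 | 143 | HLA-DRB5*01:01 | VGISSLINVHYWDMK | 389.5 | 29.77 |
ISSLINVHY | 135 | 143 | HLA-DRB5*01:01 | EVVGISSLINVHYWD | 477.6 | 32.44 |
ISSLINVHY | 135 | 143 | HLA-DRB5*01:01 | TEVVGISSLINVHYW | 496.9 | 32.98 |
INSLFSNLM | 328 | 336 | HLA-DPA1*02:01/DPB1*05:01 | LINSLFSNLMPKVSG | 986.1 | 17.6 |
INSLFSNLM | 328 | 336 | HLA-DQA1*05:01/DQB1*02:01 | PVVNLINSLFSNLMP | 605.8 | 13.35 |
INSLFSNLM | 328 | 336 | HLA-DQA1*05:01/DQB1*02:01 | VNLINSLFSNLMPKV | 684.9 | 14.9 |
INSLFSNLM | 328 | 336 | HLA-DQA1*05:01/DQB1*02:01 | VVNLINSLFSNLMPK | 687 | 14.94 |
INSLFSNLM | 328 | 336 | HLA-DQA1*05:01/DQB1*02:01 | YPVVNLINSLFSNLM | 803.7 | 17.1 |
INSLFSNLM | 328 | 336 | HLA-DQA1*05:01/DQB1*02:01 | NLINSLFSNLMPKVS | 985.7 | 20.22 |
INSLFSNLM | 328 | 336 | HLA-DRB1*01:01 | VNLINSLFSNLMPKV | 10.1 | 5.27 |
INSLFSNLM | 328 | 336 | HLA-DRB1*01:01 | NLINSLFSNLMPKVS | 10.8 | 5.8 |
INSLFSNLM | 328 | 336 | HLA-DRB1*01:01 | YPVVNLINSLFSNLM | 11.1 | 6.03 |
INSLFSNLM | 328 | 336 | HLA-DRB1*01:01 | VVNLINSLFSNLMPK | 12.2 | 6.81 |
INSLFSNLM | 328 | 336 | HLA-DRB1*01:01 | LINSLFSNLMPKVSG | 12.5 | 7.01 |
INSLFSNLM | 328 | 336 | HLA-DRB1*01:01 | PVVNLINSLFSNLMP | 15.1 | 8.66 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | VNLINSLFSNLMPKV | 12.6 | 0.37 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | VVNLINSLFSNLMPK | 15.1 | 0.56 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | NLINSLFSNLMPKVS | 15.7 | 0.6 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | PVVNLINSLFSNLMP | 20.8 | 1.04 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | LINSLFSNLMPKVSG | 21.2 | 1.08 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | YPVVNLINSLFSNLM | 26.8 | 1.6 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:01 | INSLFSNLMPKVSGQ | 37.8 | 2.64 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | YPVVNLINSLFSNLM | 13 | 0.53 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | PVVNLINSLFSNLMP | 17.1 | 0.95 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | VVNLINSLFSNLMPK | 19.1 | 1.18 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | VNLINSLFSNLMPKV | 19.7 | 1.25 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | NLINSLFSNLMPKVS | 27.3 | 2.16 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | LINSLFSNLMPKVSG | 45.3 | 4.33 |
INSLFSNLM | 328 | 336 | HLA-DRB1*04:05 | INSLFSNLMPKVSGQ | 72 | 7.09 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | YPVVNLINSLFSNLM | 11.4 | 1.98 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | PVVNLINSLFSNLMP | 16 | 3.01 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | VVNLINSLFSNLMPK | 23.4 | 4.52 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | VNLINSLFSNLMPKV | 26.9 | 5.16 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | NLINSLFSNLMPKVS | 41.7 | 7.47 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | LINSLFSNLMPKVSG | 62.6 | 10.25 |
INSLFSNLM | 328 | 336 | HLA-DRB1*07:01 | INSLFSNLMPKVSGQ | 102.1 | 14.38 |
INSLFSNLM | 328 | 336 | HLA-DRB1*09:01 | NLINSLFSNLMPKVS | 68.6 | 4.65 |
INSLFSNLM | 328 | 336 | HLA-DRB1*09:01 | LINSLFSNLMPKVSG | 78.9 | 5.43 |
INSLFSNLM | 328 | 336 | HLA-DRB1*09:01 | VNLINSLFSNLMPKV | 79.8 | 5.5 |
INSLFSNLM | 328 | 336 | HLA-DRB1*09:01 | VVNLINSLFSNLMPK | 104.3 | 7.19 |
INSLFSNLM | 328 | 336 | HLA-DRB1*09:01 | PVVNLINSLFSNLMP | 148 | 9.98 |
INSLFSNLM | 328 | 336 | HLA-DRB1*09:01 | YPVVNLINSLFSNLM | 159.9 | 10.66 |
INSLFSNLM | 328 | 336 | HLA-DRB1*11:01 | VNLINSLFSNLMPKV | 139.4 | 16.7 |
INSLFSNLM | 328 | 336 | HLA-DRB1*11:01 | VVNLINSLFSNLMPK | 256.6 | 22.99 |
INSLFSNLM | 328 | 336 | HLA-DRB1*15:01 | PVVNLINSLFSNLMP | 164.5 | 14.76 |
INSLFSNLM | 328 | 336 | HLA-DRB1*15:01 | VVNLINSLFSNLMPK | 177.2 | 15.59 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | YPVVNLINSLFSNLM | 72.5 | 5.63 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | PVVNLINSLFSNLMP | 73.2 | 5.69 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | VVNLINSLFSNLMPK | 83.5 | 6.51 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | VNLINSLFSNLMPKV | 92.3 | 7.19 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | NLINSLFSNLMPKVS | 149.5 | 11.22 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | LINSLFSNLMPKVSG | 179.5 | 13.11 |
INSLFSNLM | 328 | 336 | HLA-DRB4*01:01 | INSLFSNLMPKVSGQ | 203.6 | 14.51 |
INSLFSNLM | 328 | 336 | HLA-DRB5*01:01 | VNLINSLFSNLMPKV | 12.4 | 3.03 |
INSLFSNLM | 328 | 336 | HLA-DRB5*01:01 | VVNLINSLFSNLMPK | 16.5 | 4.08 |
INSLFSNLM | 328 | 336 | HLA-DRB5*01:01 | PVVNLINSLFSNLMP | 21.8 | 5.28 |
INSLFSNLM | 328 | 336 | HLA-DRB5*01:01 | NLINSLFSNLMPKVS | 21.8 | 5.28 |
INSLFSNLM | 328 | 336 | HLA-DRB5*01:01 | LINSLFSNLMPKVSG | 36 | 7.92 |
INSLFSNLM | 328 | 336 | HLA-DRB5*01:01 | INSLFSNLMPKVSGQ | 63.6 | 11.69 |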
Table 3: List of the proposed epitopes that had binding affinity to MHC Class II alleles.
Population coverage analysis
Epitopes predicted to interact with MHC I and II alleles were selected for population coverage analysis. The world population coverage of all epitopes that bind to MHC I and MHC II is listed in Tables 4 and 5 respectively. The proposed epitopes and their coverage results are shown in Table 6.
Epitope | Coverage class I | Total HLA hits |
---|---|---|
APKRKASST | 12.78% | 1 |
ASVPKLLVK | 15.53% | 1 |
AYSVARVSL | 3.04% | 1 |
DSITQIELY | 5.82% | 1 |
DTLQMWEAI | 4.61% | 1 |
EAISVKTEV | 2.50% | 1 |
EVVGISSLI | 8.25% | 2 |
FSNTLTTVL | 7.04% | 2 |
GVNYHMFAI | 6.51% | 2 |
ISSLINVHY | 3.42% | 1 |
ITCDTLQMW | 7.26% | 2 |
KENLPAYSV | 10.93% | 2 |
KRKASSTCK | 3.89% | 1 |
LLVKGGVEV | 40.60% | 2 |
LPRYFNVTL | 20.62% | 2 |
LQMWEAISV | 42.23% | 3 |
MPKVSGQPM | 29.99% | 3 |
NEDITCDTL | 7.81% | 1 |
NPYPVVNLI | 9.87% | 2 |
NVHYWDMKR | 11.03% | 2 |
QMWEAISVK | 16.81% | 1 |
RVHDYGAGI | 3.89% | 1 |
RYFNVTLRK | 23.91% | 3 |
RYYGSIQTG | 3.04% | 1 |
SKNENSRYY | 15.52% | 1 |
SLFSNLMPK | 38.86% | 4 |
SSLINVHYW | 7.26% | 2 |
SVARVSLPM | 29.93% | 4 |
TEVVGISSL | 11.13% | 2 |
Epitope set | 94.16% |
Table 4: Population coverage of all epitopes in MHC class I.
Epitope | Coverage class II | Total HLA hits | Epitope | Coverage class II | Total HLA hits |
---|---|---|---|---|---|
KRKASSTCK | 0.00% | 1 | GAGIPVSGV | 0.00% | 2 |
VPKLLVKGG | 10.54% | 2 | AGIPVSGVN | 0.00% | 1 |
PKLLVKGGV | 0.00% | 1 | IPVSGVNYH | 18.41% | 2 |
KLLVKGGVE | 18.23% | 1 | PVSGVNYHM | 17.82% | 2 |
LLVKGGVEV | 28.79% | 3 | VSGVNYHMF | 18.23% | 3 |
LVKGGVEVL | 34.26% | 7 | SGVNYHMFA | 27.90% | 3 |
VKGGVEVLS | 6.40% | 3 | VNYHMFAIG | 0.00% | 1 |
KGGVEVLSV | 0.00% | 1 | NYHMFAIGG | 4.77% | 2 |
GGVEVLSVV | 0.00% | 1 | YHMFAIGGE | 24.10% | 4 |
GVEVLSVVT | 18.23% | 1 | HMFAIGGEP | 9.32% | 5 |
VEVLSVVTG | 18.15% | 4 | MFAIGGEPL | 0.00% | 1 |
EVLSVVTGE | 0.00% | 2 | FAIGGEPLD | 56.92% | 10 |
VLSVVTGED | 3.02% | 1 | AIGGEPLDL | 0.00% | 1 |
LSVVTGEDS | 18.23% | 3 | IGGEPLDLQ | 0.00% | 3 |
VVTGEDSIT | 0.00% | 1 | GEPLDLQGL | 0.00% | 2 |
VTGEDSITQ | 27.97% | 2 | PLDLQGLVL | 0.00% | 2 |
GEDSITQIE | 0.00% | 1 | QGLVLDYQT | 11.53% | 1 |
EDSITQIEL | 0.00% | 2 | TTNGGPITI | 0.00% | 1 |
DSITQIELY | 0.00% | 4 | TNGGPITIE | 0.00% | 2 |
SITQIELYL | 28.79% | 5 | LGRKMTPKN | 4.77% | 1 |
ITQIELYLN | 20.57% | 4 | GRKMTPKNQ | 21.43% | 2 |
TQIELYLNP | 4.77% | 1 | NQGLDPQAK | 11.53% | 1 |
IELYLNPRM* | 65.84% | 12 | LDPQAKAKL | 18.41% | 1 |
WYTYTYDLQ | 22.06% | 6 | NSRYYGSIQ | 18.41% | 1 |
YTYTYDLQP | 18.47% | 5 | SRYYGSIQT | 4.77% | 1 |
YTYDLQPKG | 26.80% | 5 | RYYGSIQTG | 0.00% | 1 |
TYDLQPKGS | 11.53% | 1 | YYGSIQTGS | 29.38% | 6 |
YDLQPKGSS | 10.54% | 1 | YGSIQTGSQ | 22.06% | 3 |
PKGSSPDQP | 0.00% | 1 | VLQFSNTLT | 59.12% | 11 |
KGSSPDQPI | 18.23% | 1 | LQFSNTLTT | 42.33% | 10 |
IKENLPAYS | 57.34% | 10 | QFSNTLTTV | 0.00% | 1 |
KENLPAYSV | 11.53% | 1 | FSNTLTTVL | 50.51% | 11 |
NLPAYSVAR | 18.23% | 2 | SNTLTTVLL | 11.53% | 1 |
LPAYSVARV | 30.29% | 6 | NTLTTVLLD | 0.00% | 2 |
PAYSVARVS | 0.00% | 1 | TLTTVLLDE | 0.00% | 5 |
AYSVARVSL | 27.73% | 5 | LTTVLLDEN | 7.71% | 4 |
YSVARVSLP | 28.85% | 5 | TVLLDENGV | 11.53% | 1 |
SVARVSLPM | 34.78% | 2 | VLLDENGVG | 27.97% | 3 |
VARVSLPML | 19.66% | 8 | LDENGVGPL | 6.69% | 1 |
ARVSLPMLN | 10.54% | 4 | ENGVGPLCK | 0.00% | 2 |
RVSLPMLNE | 3.02% | 1 | LCKGDGLFI | 61.09% | 6 |
VSLPMLNED | 4.77% | 2 | CKGDGLFIS | 0.00% | 1 |
SLPMLNEDI | 14.37% | 4 | GDGLFISCA | 0.00% | 2 |
LPMLNEDIT | 4.77% | 2 | IVGFLFKTS | 17.84% | 6 |
LNEDITCDT | 17.84% | 2 | VGFLFKTSG | 36.31% | 6 |
EDITCDTLQ | 0.00% | 2 | GFLFKTSGK | 11.21% | 2 |
DITCDTLQM | 0.00% | 3 | ALHGLPRYF | 15.05% | 4 |
ITCDTLQMW | 27.97% | 8 | LHGLPRYFN | 24.10% | 3 |
TCDTLQMWE | 0.00% | 1 | HGLPRYFNV | 11.53% | 1 |
CDTLQMWEA | 11.53% | 2 | LPRYFNVTL | 43.71% | 8 |
DTLQMWEAI | 0.00% | 2 | PRYFNVTLR | 4.77% | 2 |
TLQMWEAIS | 7.04% | 3 | RYFNVTLRK | 0.00% | 2 |
LQMWEAISV | 52.19% | 8 | WVKNPYPVV | 49.39% | 7 |
MWEAISVKT | 43.78% | 4 | KNPYPVVNL | 20.95% | 2 |
WEAISVKTE | 16.52% | 7 | PYPVVNLIN | 3.02% | 1 |
AISVKTEVV | 33.10% | 5 | YPVVNLINS | 7.04% | 3 |
SVKTEVVGI | 34.26% | 3 | PVVNLINSL | 11.30% | 2 |
VKTEVVGIS | 2.33% | 2 | VVNLINSLF | 41.67% | 8 |
TEVVGISSL | 0.00% | 3 | VNLINSLFS | 38.62% | 7 |
EVVGISSLI | 44.03% | 4 | NLINSLFSN | 0.00% | 4 |
VVGISSLIN | 45.82% | 10 | LINSLFSNL | 35.36% | 9 |
VGISSLINV | 11.30% | 4 | INSLFSNLM* | 65.37% | 11 |
GISSLINVH | 0.00% | 1 | NSLFSNLMP | 4.77% | 2 |
ISSLINVHY* | 69.46% | 11 | SLFSNLMPK | 0.00% | 1 |
SSLINVHYW | 0.00% | 1 | LFSNLMPKV | 41.13% | 10 |
SLINVHYWD | 0.00% | 2 | FSNLMPKVS | 25.65% | 4 |
LINVHYWDM | 34.26% | 8 | NLMPKVSGQ | 0.00% | 1 |
INVHYWDMK | 4.77% | 4 | LMPKVSGQP | 0.00% | 1 |
NVHYWDMKR | 18.41% | 2 | MPKVSGQPM | 13.72% | 3 |
VHYWDMKRV | 28.79% | 5 | KVSGQPMEG | 0.00% | 1 |
HYWDMKRVH | 10.54% | 1 | EEVRIYEGS | 0.00% | 1 |
YWDMKRVHD | 4.77% | 2 | PDIVRFLDK | 0.00% | 1 |
WDMKRVHDY | 11.53% | 1 | IVRFLDKFG | 20.57% | 6 |
RVHDYGAGI | 29.99% | 4 | VRFLDKFGQ | 10.54% | 3 |
VHDYGAGIP | 0.00% | 1 | RFLDKFGQE | 0.00% | 2 |
HDYGAGIPV | 44.03% | 4 | FLDKFGQEK | 11.53% | 2 |
DYGAGIPVS | 0.00% | 4 | LDKFGQEKT | 18.41% | 1 |
YGAGIPVSG | 27.70% | 4 | FGQEKTVYP | 26.27% | 3 |
Epitope set | 81.94% |
Table 5: Population coverage of all epitopes in MHC class II.
Epitope | Coverage Class I | Total HLA hits | Epitope | Coverage Class II | Total HLA hits |
---|---|---|---|---|---|
LLVKGGVEV | 40.60% | 2 | IELYLNPRM | 65.84% | 12 |
LQMWEAISV | 42.23% | 3 | ISSLINVHY | 69.46% | 11 |
SLFSNLMPK | 38.86% | 4 | INSLFSNLM | 65.37% | 11 |
Epitope set | 70.30% | | Epitope set | 73.11% | |
Table 6: Population coverage of proposed epitopes for both MHC class I and II in the world.
Allergenicity test
The proposed B cell epitopes and the epitopes that bind to different sets of MHC I and II alleles were submitted to the AllerTOP 2.0 software, in order to avoid the production of IgE antibodies as far as possible. The results are listed in Table 7.
B cell epitopes | Result | MHC class I epitopes | Result | MHC class II epitopes | Result |
---|---|---|---|---|---|
QEKTVYP | Probable allergen | LLVKGGVEV | Probable allergen | IELYLNPRM | Probable non-allergen |
KTVYPK | Probable allergen | LQMWEAISV | Probable allergen | ISSLINVHY | Probable allergen |
QEKTVY | Probable non-allergen | SLFSNLMPK | Probable non-allergen | INSLFSNLM | Probable allergen |
Table 7: Result of Allergenicity Test of predicted B cell and MHC class I & II epitopes.
In the current study we predicted promiscuous epitopes for the design of a subunit-based vaccine. The immune system appears to play a critical role in MCC biology, with increasing evidence that virus-specific cellular and humoral immune responses influence the prognosis of MCC patients. Newer strategies are currently being used to treat cancer; among these, peptide vaccines are promising anticancer candidates because they target tumor cells and induce a specific T cell response against them [1,13,46-48].
To the best of our knowledge there is no effective approved vaccine against this virus; however, DNA vaccines encoding the large or small T antigen, as well as VP1 virus-like particles, have previously been developed. These types of vaccine have been shown to elicit a protective, specific CD4+/CD8+ T cell response in vaccinated mice. Nevertheless, a subunit vaccine that targets a specific immunogenic protein would be helpful in generating an adequate immune response inside the host body. Furthermore, the murine model expressing a tumor cell line from B16 mouse melanoma, created by Gomez et al., would be useful in a clinical setting to address the efficacy of our predicted vaccine [17,24,49-51].
Peptide vaccination produces profound and long-lasting modifications in the adaptive immune system, comprising T and B cells. Peptide vaccines are intrinsically safer than alternative vaccine formulations. Moreover, they allow a focus solely on relevant epitopes, avoiding those that lead to non-protective responses [48]. There is currently increasing interest in developing vaccines based on synthetic peptides. Peptide vaccines are in various phases of trial and development, the vast majority of them related to cancers [52-64].
In the present study we chose predicted epitopes expected to be effective peptide antigens for both B and T cells. We selected sequences with 100% conserved identity in the VP1 major capsid protein. Several studies have revealed its ability to induce a potential immune response in MCC-positive tumors [11,13].
We chose predicted B cell epitopes expected to be potent, strongly immunogenic peptide antigens for B cells; the length of the predicted epitopes ranged from 3 to 23 amino acids. According to the linear B cell epitope prediction tools available from IEDB, the selected epitopes were above the threshold scores in Bepipred linear epitope prediction, Emini surface accessibility and Kolaskar and Tongaonkar antigenicity. The epitopes listed in Table 1 are the only conserved regions among all retrieved strains of the MCPYV VP1 protein reported in the NCBI database up to 20 October 2016, and they have a high probability of activating a humoral immune response. Epitope QEKTVYP, from 378 to 384, was found to have the highest score, followed by KTVYPK, from 380 to 385, and QEKTVY, from 378 to 383, as summarized in Table 1. These findings indicate that these epitopes are surface accessible and antigenic.
Studies have shown T cells to be important mediators of MCPyV-specific immune surveillance; thus, T cell epitope prediction was performed based on the probability of MHC-peptide ligand formation and presentation to different T cell populations [65,66]. In the development of a universal vaccine capable of inducing an adequate immune response against all circulating strains, allele binding affinity and accurate characterization of population coverage are highly recommended. A total of 29 conserved peptides were predicted to bind different MHC class I HLA alleles, as shown in Table 2. Among these, LLVKGGVEV, LQMWEAISV and SLFSNLMPK have high binding affinity as well as high percentage coverage, binding (HLA-A*02:01, HLA-A*02:06), (HLA-A*02:01, HLA-A*02:06 and HLA-B*39:01) and (HLA-A*03:01, HLA-A*11:01, HLA-A*30:01 and HLA-A*68:01) respectively. The highest scoring MHC class II epitopes were IELYLNPRM, ISSLINVHY and INSLFSNLM, as shown in Table 3. Moreover, the epitopes LLVKGGVEV and LQMWEAISV were predicted to interact with HLA-A*02:01, the most prevalent major histocompatibility complex (MHC) class I allele family in humans, which is present at high frequencies in all ethnic populations. Interestingly, the MHC I epitope SLFSNLMPK was also predicted to elicit an MHC II response, as seen in Table 3. These epitopes were found to bind several HLA-DR, DP and DQ alleles, indicating that further attention should be directed to this region. Furthermore, 225NYPIEVWCPDPSK237 and 245GSIKTGSQTPTVL257 were suggested previously by Iyer et al. [67]. The latter (GSIQTGSQ) was predicted to interact with HLA-DRB1*01:01, a gene that provides instructions for making a protein that plays a critical role in the immune system.
Allergic reactions are triggered when allergens cross-link preformed IgE bound to the high-affinity receptor FcεRI on mast cells; mast cells thus act to alert the immune system to local infection [68]. Responses to allergens in humans are very heterogeneous and involve recognition of a large number of epitopes [69]. We therefore subjected the predicted B and T cell epitopes to an allergenicity test. Among the 9 predicted epitopes, two (SLFSNLMPK and IELYLNPRM, for MHC I and MHC II respectively) were concluded to have the potential to be real epitopes because of their probable non-allergenic effect. On the other hand, the B cell epitopes KTVYPK and QEKTVYP were found to have low potential to be real epitopes because of their probable allergenic effect, which needs further experimental investigation.
The increasing incidence of human viral infections warrants the design of innovative treatment. With the recent advances in the field of bioinformatics, newer strategies are being devised to control and fight infectious diseases [70].
Merkel cell carcinoma is an aggressive, devastating disease that warrants the development of an effective protective vaccine. Several epitopes were proposed in this study, especially SLFSNLMPK, which was predicted to bind with high affinity to both MHC classes, in addition to GSIQTGSQ, which was suggested previously by Iyer et al. as adoptive immunotherapy [67]. Further in vitro and in vivo studies will be needed to confirm the effectiveness of these predicted epitopes as a peptide vaccine.
We would like to thank the members of Waves for Medical Research and Training Center.