ISSN: 0974-276X
Case Report - (2011) Volume 4, Issue 7
Predicting the antigenic regions of a protein is important for assessing which regions of a polypeptide chain are exposed and which are buried. This can be achieved by calculating and plotting the hydrophilicity of a protein using values from the Hopp-Woods scale. In this paper, we report a Hopp-Woods hydrophilicity plot, with the amino acid sequence of a protein on the x-axis and the degree of hydrophobicity or hydrophilicity on the y-axis, implemented in Python using the scipy, matplotlib and numpy modules. A case study is reported using the sialidase-4 enzyme. The program code is available on request.
Keywords: Python, Hydropathy, Hydrophobicity, Hydrophilicity.
It has been reported in the literature that predicting the antigenic regions of a protein supports a rational approach to the synthesis of peptides that may elicit antibodies reactive with the intact protein [1]. Various methods have been devised to locate hydrophilic regions in a protein [2], and it was reported earlier that antigenic determinants are located on the surface and often contain charged and polar residues. However, not all antigenic regions are hydrophilic and not all hydrophilic regions are antigenic [3]. A hydropathy plot is a quantitative analysis of the degree of hydrophobicity or hydrophilicity of the amino acids of a protein. It is generally used to assess the possible exposed or buried domains of a protein, with values assigned from the Hopp-Woods scale. The plot has the amino acid sequence of a protein on its x-axis and the degree of hydrophobicity or hydrophilicity on its y-axis. In this paper, we report a hydropathy plot developed in object-oriented Python using the scipy, matplotlib and numpy modules [4,5].
The hydrophilicity plot is developed as an object-oriented Python program to which users submit the protein sequence in FASTA format; validations have been written to catch errors in the input. The program is organized in two parts: the first reads a given FASTA file using the read_fasta.py program, and the second is the hydrophilicity_plot.py program.
In general, hydrophobicity or hydrophilicity plots are designed to identify the polar and apolar residues of a given protein sequence. Residues that span the membrane are highly hydrophobic in nature, whereas residues exposed on the surface of proteins tend to be hydrophilic. The program uses the most commonly used hydrophilicity scale, given in Table 1.
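The input step described above can be sketched as follows. This is a minimal illustration, not the authors' read_fasta.py: the function name `read_fasta`, the constant `VALID_RESIDUES`, and the specific checks (a single record, a leading `>` header, only the 20 standard one-letter residue codes) are assumptions made for the example.

```python
# Hypothetical sketch of FASTA reading with input validation.
# The names and the exact validation rules are assumptions, not the authors' code.

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def read_fasta(text):
    """Parse a single-record FASTA string and return (header, sequence)."""
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("expected a single FASTA record")

    header = lines[0][1:]
    # Sequence lines may be wrapped at any width, so join them.
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("FASTA record contains no sequence")

    invalid = sorted(set(sequence) - VALID_RESIDUES)
    if invalid:
        raise ValueError("unsupported residue codes: " + ", ".join(invalid))
    return header, sequence


if __name__ == "__main__":
    record = ">example\nMMSSAAFPRW\nLSMGV\n"
    header, sequence = read_fasta(record)
    print(header, len(sequence))
```

Rejecting unknown residue codes at this stage means the later scale lookup cannot fail on an unexpected character.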
Amino acids | Hopp-Woods |
---|---|
Alanine | -0.5 |
Arginine | 3 |
Asparagine | 0.2 |
Aspartic acid | 3 |
Cysteine | -1 |
Glutamine | 0.2 |
Glutamic acid | 3 |
Glycine | 0 |
Histidine | -0.5 |
Isoleucine | -1.8 |
Leucine | -1.8 |
Lysine | 3 |
Methionine | -1.3 |
Phenylalanine | -2.5 |
Proline | 0 |
Serine | 0.3 |
Threonine | -0.4 |
Tryptophan | -3.4 |
Tyrosine | -2.3 |
Valine | -1.5 |
Table 1: Hopp-Woods hydrophilicity scale.
The Hopp-Woods scale is a hydrophilicity index developed to predict potential antigenic sites of globular proteins, which tend to be rich in charged and polar residues. Apolar residues are assigned negative values. The program looks up the scale value of each residue in the input FASTA file and, by default, applies a moving window of 6 residues. Values greater than 0 represent hydrophilic regions, whereas negative values represent hydrophobic regions.
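The moving-window calculation can be sketched as below. This is an illustration under stated assumptions rather than the program itself: the dictionary `HOPP_WOODS` transcribes Table 1 using one-letter residue codes, and the function name `window_averages` and the choice of an unweighted mean over each window are assumptions made for the example.

```python
# Hypothetical sketch: mean Hopp-Woods hydrophilicity over a sliding window.
# HOPP_WOODS transcribes Table 1; the function name and the use of an
# unweighted mean are assumptions for illustration.

HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def window_averages(sequence, window=6):
    """Return the mean scale value of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [HOPP_WOODS[residue] for residue in sequence]
    # One average per window start position; partial windows are not used.
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    for average in window_averages("MMSSAAFPRW", window=6):
        label = "hydrophilic" if average > 0 else "hydrophobic or neutral"
        print(round(average, 2), label)
```

For the first six residues of the case-study sequence, MMSSAA, the window sum is -1.3 - 1.3 + 0.3 + 0.3 - 0.5 - 0.5 = -3.0, so the mean is -0.5 and the window is classified as hydrophobic.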
Case study
A case study is reported here to assess the ability of the program to detect hydrophilic and hydrophobic regions of a protein using the Hopp-Woods scale. The sialidase-4 protein sequence was selected from our earlier study, in which we reported possible viral sequence inserts in the human genome [6]. The input sequence and the hydrophilicity values calculated from the Hopp-Woods scale for the entire length of the sequence are given below, and the plot is shown in Figure 1.
>tr|Q3KR05|Q3KR05_HUMAN Sialidase 4 OS=Homo sapiens GN=NEU4 PE=2 SV=1
MMSSAAFPRWLSMGVPRTPSRTVLFERERTGLTYRVPSLLPVPPGPTLLAFVEQRLSPDD
SHAHRLVLRRGTLAGGSVRWGALHVLGTAALAEHRSMNPCPVHDAGTGTVFLFFIAVLGHTPEAVQI
ATGRNAARLCCVASRDAGLSWGSARDLTEEAIGGAVQDWATFAVGPGHGVQLPSGRLLVPAYTYRV
DRRECFGKICRTSPHSFAFYSDDHGRTWRCGGLVPNLRSGECQLAAVDGGQAGSFLYCNARSPLGSRV
QALSTDEGTSFLPAERVASLPETAWGCQGSIVGFPAPAPNRPRDDSWSVGPRSPLQPPLLGPGVHEPPEE
AAVDPRGGQVPGGPFSRLQPRGDGPRQPGPRPGVSGDVGSWTLALPMPFAAPPQSPTWLLYSHPVGRR
ARLHMGIRLSQSPLDPRSWTEPWVIYEGPSGYSDLASIGPAPEGGLVFACLYESGARTSYDEISFCTFSLR
EVLENVPASPKPPNLGDKPRGCCWPS
HW hydrophilicity values (calculated by the program):
-1.3, -1.3, 0.3, 0.3, -0.5, -0.5, -2.5, 0.0, 3.0, -3.4, -1.8, 0.3, -1.3, 0.0, -1.5, 0.0, 3.0, -0.4, 0.0, 0.3, 3.0, -0.4, -1.5, -1.8, -2.5, 3.0, 3.0, 3.0, 3.0, -0.4, 0.0, -1.8, -0.4, -2.3, 3.0, -1.5, 0.0, 0.3, -1.8, -1.8, 0.0, -1.5, 0.0, 0.0, 0.0, 0.0, -0.4, -1.8, -1.8, -0.5, -2.5, -1.5, 3.0, 0.2, 3.0, -1.8, 0.3, 0.0, 3.0, 3.0, 0.3, -0.5, -0.5, -0.5, 3.0, -1.8, -1.5, -1.8, 3.0, 3.0, 0.0, -0.4, -1.8, -0.5, 0.0, 0.0, 0.3, -1.5, 3.0, -3.4, 0.0, -0.5, -1.8, -0.5, -1.5, -1.8, 0.0, -0.4, -0.5, -0.5, -1.8, -0.5, 3.0, -0.5, 3.0, 0.3, -1.3, 0.2, 0.0, -1.0, 0.0, -1.5, -0.5, 3.0, -0.5, 0.0, -0.4, 0.0, -0.4, -1.5, -2.5, -1.8, -2.5, -2.5, -1.8, -0.5, -1.5, -1.8, 0.0, -0.5, -0.4, 0.0, 3.0, -0.5, -1.5, 0.2, -1.8, -0.5, -0.4, 0.0, 3.0, 0.2, -0.5, -0.5, 3.0, -1.8, -1.0, -1.0, -1.5, -0.5, 0.3, 3.0, 3.0, -0.5, 0.0, -1.8, 0.3, -3.4, 0.0, 0.3, -0.5, 3.0, 3.0, -1.8, -0.4, 3.0, 3.0, -0.5, -1.8, 0.0, 0.0, -0.5, -1.5, 0.2, 3.0, -3.4, -0.5, -0.4, -2.5, -0.5, -1.5, 0.0, 0.0, 0.0, -0.5, 0.0, -1.5, 0.2, -1.8, 0.0, 0.3, 0.0, 3.0, -1.8, -1.8, -1.5, 0.0, -0.5, -2.3, -0.4, -2.3, 3.0, -1.5, 3.0, 3.0, 3.0, 3.0, -1.0, -2.5, 0.0, 3.0, -1.8, -1.0, 3.0, -0.4, 0.3, 0.0, -0.5, 0.3, -2.5, -0.5, -2.5, -2.3, 0.3, 3.0, 3.0, -0.5, 0.0, 3.0, -0.4, -3.4, 3.0, -1.0, 0.0, 0.0, -1.8, -1.5, 0.0, 0.2, -1.8, 3.0, 0.3, 0.0, 3.0, -1.0, 0.2, -1.8, -0.5, -0.5, -1.5, 3.0, 0.0, 0.0, 0.2, -0.5, 0.0, 0.3, -2.5, -1.8, -2.3, -1.0, 0.2, -0.5, 3.0, 0.3, 0.0, -1.8, 0.0, 0.3, 3.0, -1.5, 0.2, -0.5, -1.8, 0.3, -0.4, 3.0, 3.0, 0.0, -0.4, 0.3, -2.5, -1.8, 0.0, -0.5, 3.0, 3.0, -1.5, -0.5, 0.3, -1.8, 0.0, 3.0, -0.4, -0.5, -3.4, 0.0, -1.0, 0.2, 0.0, 0.3, -1.8, -1.5, 0.0, -2.5, 0.0, -0.5, 0.0, -0.5, 0.0, 0.2, 3.0, 0.0, 3.0, 3.0, 3.0, 0.3, -3.4, 0.3, -1.5, 0.0, 0.0, 3.0, 0.3, 0.0, -1.8, 0.2, 0.0, 0.0, -1.8, -1.8, 0.0, 0.0, 0.0, -1.5, -0.5, 3.0, 0.0, 0.0, 3.0, 3.0, -0.5, -0.5, -1.5, 3.0, 0.0, 3.0, 0.0, 0.0, 0.2, -1.5, 0.0, 0.0, 0.0, 0.0, -2.5, 0.3, 3.0, -1.8, 0.2, 0.0, 3.0, 0.0, 3.0, 0.0, 0.0, 3.0, 0.2, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, -1.5, 0.3, 0.0, 3.0, -1.5, 0.0, 0.3, -3.4, -0.4, -1.8, -0.5, -1.8, 0.0, -1.3, 0.0, -2.5, -0.5, -0.5, 0.0, 0.0, 0.2, 0.3, 0.0, -0.4,
-3.4, -1.8, -1.8, -2.3, 0.3, -0.5, 0.0, -1.5, 0.0, 3.0, 3.0, -0.5, 3.0, -1.8, -0.5, -1.3, 0.0, -1.8, 3.0, -1.8, 0.3, 0.2, 0.3, 0.0, -1.8, 3.0, 0.0, 3.0, 0.3, -3.4, -0.4, 3.0, 0.0, -3.4, -1.5, -1.8, -2.3, 3.0, 0.0, 0.0, 0.3, 0.0, -2.3, 0.3, 3.0, -1.8, -0.5, 0.3, -1.8, 0.0, 0.0, -0.5, 0.0, 3.0, 0.0, 0.0, -1.8, -1.5, -2.5, -0.5, -1.0, -1.8, -2.3, 3.0, 0.3, 0.0, -0.5, 3.0, -0.4, 0.3, -2.3, 3.0, 3.0, -1.8, 0.3, -2.5, -1.0, -0.4, -2.5, 0.3, -1.8, 3.0, 3.0, -1.5, -1.8, 3.0, 0.2, -1.5, 0.0, -0.5, 0.3, 0.0, 3.0, 0.0, 0.0, 0.2, -1.8, 0.0, 3.0, 3.0, 0.0, 3.0, 0.0, -1.0, -1.0, -3.4, 0.0, 0.3
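A plot of this kind could be produced along the following lines. This is a sketch under assumptions, not the authors' hydrophilicity_plot.py: the names `HOPP_WOODS`, `profile` and `plot_profile`, the output file name, placing each point at the centre of its window, and the use of matplotlib's non-interactive Agg backend are all choices made for the example.

```python
# Hypothetical sketch: compute a windowed Hopp-Woods profile and plot it.
# All names, the output path and the plotting choices are assumptions.

HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def profile(sequence, window=6):
    """Return (positions, averages); positions are 1-based window centres."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [HOPP_WOODS[residue] for residue in sequence]
    count = len(values) - window + 1
    averages = [sum(values[i:i + window]) / window for i in range(count)]
    # A window covering residues 1..6 is centred at position 3.5.
    positions = [i + (window + 1) / 2 for i in range(count)]
    return positions, averages


def plot_profile(sequence, window=6, path="hopp_woods_profile.png"):
    """Save the hydrophilicity profile as an image and return its path."""
    # Imported here so that profile() can be used without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # write to a file without needing a display
    import matplotlib.pyplot as plt

    positions, averages = profile(sequence, window)
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(positions, averages, linewidth=1)
    # Zero line separates hydrophilic (above) from hydrophobic (below).
    ax.axhline(0.0, color="grey", linestyle="--", linewidth=0.8)
    ax.set_xlabel("Residue position")
    ax.set_ylabel("Mean Hopp-Woods hydrophilicity")
    ax.set_title(f"Hopp-Woods profile (window = {window})")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return path


if __name__ == "__main__":
    # First 60 residues of the sialidase-4 sequence shown above.
    fragment = "MMSSAAFPRWLSMGVPRTPSRTVLFERERTGLTYRVPSLLPVPPGPTLLAFVEQRLSPDD"
    print(plot_profile(fragment))
```

Passing `window=1` to `profile` returns the unaveraged scale value of each residue, which is the form in which the values are listed above.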
A Python-based program was implemented to assess the possible exposed and buried regions of a protein sequence based on the hydrophilicity and hydrophobicity values of each amino acid, represented as a Hopp-Woods plot. A Python class was written to test the correctness of the protein sequence submitted in FASTA format. Validations are applied to the input, and the results are displayed either with the default window size or with a window size chosen by the user. The test case reported in this paper, using the sialidase-4 enzyme, supports the outcome of the program.