Journal of Proteomics & Bioinformatics

Open Access

ISSN: 0974-276X

Case Report - (2011) Volume 4, Issue 7

A Python Based Hydrophilicity Plot to Assess the Exposed and Buried Regions of a Protein

P. Pandarinath*, A. Appa Rao and G. Lavanya Devi
Department of Computer Sciences, College of Engineering (A), Andhra University, Visakhapatnam, India
*Corresponding Author: P. Pandarinath, Department of Computer Sciences, College of Engineering (A), Andhra University, Visakhapatnam, India

Abstract

Predicting the antigenic regions of a protein is of prime importance in assessing whether segments of a polypeptide chain are exposed or buried. This can be achieved by calculating and plotting the hydrophilicity of a protein using values from the Hopp-Woods scale. In this paper, we report a Hopp-Woods hydrophilicity plot, with the amino acid sequence of a protein on the x-axis and the degree of hydrophobicity or hydrophilicity on the y-axis, implemented in the Python language using the SciPy, Matplotlib and NumPy modules. A case study is reported using the sialidase-4 enzyme. The program code is available on request.

Keywords: Python, Hydropathy, Hydrophobicity, Hydrophilicity.

Introduction

It has been reported in the literature that predicting the antigenic regions of a protein supports a rational approach to the synthesis of peptides that may elicit antibodies reactive with the intact protein [1]. Various methods have been devised to locate hydrophilic regions in a protein [2], and it was reported earlier that antigenic determinants are located on the surface and often contain charged and polar residues. However, not all antigenic regions are hydrophilic, and not all hydrophilic regions are antigenic [3]. A hydropathy plot is a quantitative analysis of the degree of hydrophobicity or hydrophilicity of the amino acids of a protein. It is generally used to assess the possible exposed or buried domains of a protein; here the values are assigned from the Hopp-Woods scale. The plot has the amino acid sequence of a protein on its x-axis and the degree of hydrophobicity or hydrophilicity on its y-axis. In this paper, we report a hydropathy plot developed in the object-oriented Python language using the SciPy, Matplotlib and NumPy modules [4,5].

Materials and Methods

The hydrophilicity plot is developed as an object-oriented Python program to which users submit a protein sequence in FASTA format; validations have been written to catch errors in the input. The program is organized in two parts: read_fasta.py, which reads a given FASTA file, and hydrophilicity_plot.py, which produces the hydrophilicity plot.
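The reading and validation step can be illustrated with a minimal sketch. This is not the authors' read_fasta.py; the function name `parse_fasta`, the constant `VALID_RESIDUES` and the specific checks (a header line starting with `>`, a non-empty sequence, only the 20 standard one-letter residue codes) are assumptions made for illustration.

```python
# Hypothetical sketch of FASTA reading with input validation.
# Names and checks are illustrative assumptions, not the published program.

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must begin with a '>' header line")

    header = lines[0][1:]
    sequence_lines = []
    for line in lines[1:]:
        if line.startswith(">"):
            break  # only the first record is read in this sketch
        sequence_lines.append(line)

    sequence = "".join(sequence_lines).upper()
    if not sequence:
        raise ValueError("FASTA record contains no sequence")

    invalid = sorted(set(sequence) - VALID_RESIDUES)
    if invalid:
        raise ValueError("unexpected residue codes: " + ", ".join(invalid))
    return header, sequence
```

Calling `parse_fasta(open(path).read())` on a single-record file would return the header text and the concatenated sequence, or raise `ValueError` with a message describing the problem.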

In general, hydrophobicity or hydrophilicity plots are designed to identify the polar and apolar residues of a given protein sequence. Residues that span the membrane are typically highly hydrophobic, whereas residues exposed on the surface of a protein tend to be hydrophilic. The program uses the widely used Hopp-Woods hydrophilicity scale, given in Table 1.

Amino acid       Hopp-Woods value
Alanine          -0.5
Arginine          3.0
Asparagine        0.2
Aspartic acid     3.0
Cysteine         -1.0
Glutamine         0.2
Glutamic acid     3.0
Glycine           0.0
Histidine        -0.5
Isoleucine       -1.8
Leucine          -1.8
Lysine            3.0
Methionine       -1.3
Phenylalanine    -2.5
Proline           0.0
Serine            0.3
Threonine        -0.4
Tryptophan       -3.4
Tyrosine         -2.3
Valine           -1.5

Table 1: Hopp-Woods hydrophilicity scale.
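As an illustration of how Table 1 can be used in code, the sketch below stores the scale as a dictionary keyed by one-letter residue code and looks up the value of each residue. The names `HOPP_WOODS` and `residue_values` are hypothetical and are not taken from the authors' program.

```python
# Hopp-Woods hydrophilicity values from Table 1, keyed by one-letter code.
# The dictionary and function names are illustrative assumptions.
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def residue_values(sequence):
    """Return the Hopp-Woods value of every residue in the sequence."""
    return [HOPP_WOODS[residue] for residue in sequence.upper()]


# First ten residues of the sialidase-4 sequence used in the case study.
print(residue_values("MMSSAAFPRW"))
# [-1.3, -1.3, 0.3, 0.3, -0.5, -0.5, -2.5, 0.0, 3.0, -3.4]
```

A residue code that is not in the table raises `KeyError`, so in practice the sequence would be validated before the lookup.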

The Hopp-Woods scale is a hydrophilicity index developed to predict potential antigenic sites of globular proteins, which tend to be rich in charged and polar residues. Apolar residues are assigned negative values. The program looks up the value of each residue in the input FASTA sequence and, by default, applies a moving window of 6 residues. Values greater than 0 represent hydrophilic regions, whereas negative values represent hydrophobic regions.
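The moving-window step can be sketched as follows. The text does not say how the values inside a window are combined; this sketch assumes the arithmetic mean, as in the original Hopp-Woods hexapeptide averaging. The function name `window_average` is hypothetical.

```python
# Hypothetical sketch of the moving-window step, assuming each window
# score is the arithmetic mean of the per-residue values it covers.


def window_average(values, window=6):
    """Return the mean of every run of `window` consecutive values."""
    if window < 1:
        raise ValueError("window size must be at least 1")
    # A sequence of n values yields n - window + 1 window scores.
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]
```

With the default window of 6, a sequence of n residues gives n - 5 scores. For the first six residues of the case-study sequence (MMSSAA, values -1.3, -1.3, 0.3, 0.3, -0.5, -0.5) the mean is about -0.5, which would mark the N-terminus as slightly hydrophobic.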

Case study

A case study is reported here to assess the ability of the program to detect hydrophilic and hydrophobic regions of a protein using the Hopp-Woods scale. The sialidase-4 protein sequence was selected from our earlier study, in which we reported possible viral sequence inserts in the human genome [6]. The Hopp-Woods hydrophilicity values calculated for the sialidase-4 sequence are listed below for its entire length, and the corresponding plot is shown in Figure 1.

Figure 1: Hydrophilicity plot of sialidase-4 enzyme.
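A plot of this kind could be drawn with Matplotlib along the following lines. This is a sketch, not the authors' hydrophilicity_plot.py: the function names, the choice to place each score at the centre of its window, the output file name and the figure size are all assumptions, and Matplotlib must be installed for `plot_profile` to run.

```python
# Hypothetical plotting sketch; names and layout choices are assumptions.


def window_centers(n_windows, window=6):
    """Return the 1-based sequence position at the centre of each window."""
    # The window starting at 0-based index i covers positions i+1 .. i+window.
    return [i + (window + 1) / 2 for i in range(n_windows)]


def plot_profile(profile, window=6, out_path="hydrophilicity.png"):
    """Save a line plot of window scores against sequence position."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    positions = window_centers(len(profile), window)
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(positions, profile)
    ax.axhline(0.0, color="gray", linewidth=0.8)  # hydrophilic above, hydrophobic below
    ax.set_xlabel("Amino acid position")
    ax.set_ylabel("Hopp-Woods hydrophilicity")
    ax.set_title("Hydrophilicity plot (window = %d)" % window)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

The horizontal line at 0 separates the hydrophilic region (above) from the hydrophobic region (below), following the interpretation given in Materials and Methods.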

>tr|Q3KR05|Q3KR05_HUMAN Sialidase 4 OS=Homo sapiens GN=NEU4 PE=2 SV=1
MMSSAAFPRWLSMGVPRTPSRTVLFERERTGLTYRVPSLLPVPPGPTLLAFVEQRLSPDD
SHAHRLVLRRGTLAGGSVRWGALHVLGTAALAEHRSMNPCPVHDAGTGTVFLFFIAVLGHTPEAVQI
ATGRNAARLCCVASRDAGLSWGSARDLTEEAIGGAVQDWATFAVGPGHGVQLPSGRLLVPAYTYRV
DRRECFGKICRTSPHSFAFYSDDHGRTWRCGGLVPNLRSGECQLAAVDGGQAGSFLYCNARSPLGSRV
QALSTDEGTSFLPAERVASLPETAWGCQGSIVGFPAPAPNRPRDDSWSVGPRSPLQPPLLGPGVHEPPEE
AAVDPRGGQVPGGPFSRLQPRGDGPRQPGPRPGVSGDVGSWTLALPMPFAAPPQSPTWLLYSHPVGRR
ARLHMGIRLSQSPLDPRSWTEPWVIYEGPSGYSDLASIGPAPEGGLVFACLYESGARTSYDEISFCTFSLR
EVLENVPASPKPPNLGDKPRGCCWPS

HW hydrophilicity values (calculated by the program):

 -lJ, -lJ, OJ, OJ, -0.5, -0 .5, -2.5, 00, 3.0, -34, -1 .8, 0 J, -1 J, 00, -1 .5, 00, 30, -04, 00, OJ, 3.0, -0.4, -1.5, -1 .8, -2 .5, 30, 30, 3.0, 3.0, -04, 00, -1.8, -04, -2J, 3.0, -1.5, 0.0, OJ, -1.8, -1.8, 00, -1.5, 00, 00, 00, 00, -0.4, -1.8, -1.8, -0.5, -2.5, -1.5, 30, 02, 30, -1.8, 0 J, 00, 3.0, 30, OJ, -0.5, -0 .5, -0.5, 3.0, -1.8, -1.5, -1.8, 30, 3.0, 00, -04, -1 .8, -0.5, 00, 0.0, OJ, -1.5, 3.0, -34, 00, -0.5, -1.8, -0.5, -1.5, -1.8, 00, -04, -0.5, -0.5, -1.8, -0.5, 3.0, -0.5, 30, OJ, -13, 02, 00,- 1.0, 0.0, -1.5, -0.5, 30, -0.5, 0.0, -04, 0.0, -0.4, -1.5, -2.5, -18, -2 .5, -2 .5, -18, -0.5, -1.5, -18, 00, -0.5, -OA, 00, 3.0, -0 .5, -1 .5, 02, -1 .8, -0.5, -04, 00, 30, 02, -0.5, -0.5, 30, -1.8, -1 .0, -10, -1.5, -0 .5, 0.3, 3.0, 3.0, -0.5, 0 .0, -1.8, 0.3, -3.4, 00, 0 J, -0.5, 30, 3.0, -1.8, -0.4, 30, 30, -0.5,- 1.8,00,00, -0 .5, -1.5, 02, 3.0, -34, -0 .5, -04, -2.5, -0 .5, -1.5, 0 .0, 00, 00, -0.5, 00, -1 .5, 0.2,- 18, 0.0, OJ, 0.0, 30, -18, -18, -1.5, 0.0, -0 .5, -2J, -04, -2J, 30, -1.5, 3.0, 30, 30, 30, -1.0,- 2.5,00,30, -1.8, -1.0, 3.0, -OA, 0.3, 0.0, -0.5, OJ, -2.5, -0.5, -2 .5, -23, 03, 30, 3.0, -0.5, 00, 3.0, -0.4, -3.4, 3.0, -1.0, 0.0,0 .0, -1.8, -1.5, 00, 0.2, -1.8, 30, 03, 0.0, 3.0, -1.0, 0.2, -1.8, -0.5,- 0.5, -1.5, 3.0, 00, 00, 02, -0 .5, 0.0, 03, -2.5, -1 .8, -23, -10, 0.2, -0.5, 3.0, 0.3, 00, -1 .8, 00, OJ, 30, -1.5, 02, -0.5, -18, 0 J, -04, 30, 3.0, 0.0, -04, 0 J, -2.5, -18, 0.0, -0.5, 30, 30, -1.5,- 0.5, OJ, -1.8, 00, 30, -04, -0 .5, -34, 0.0, -10, 0.2,0 .0, OJ, -1.8, -1.5, 00, -2.5, 00, -0.5, 00,- 0.5, 00, 0.2, 3.0, 0.0, 3.0, 3.0,3 .0, 03, -34, 03, -1.5, 0.0, 0.0, 3.0, 03, 00, -1.8, 0.2, 00, 00,- 1.8, -1 .8, 0.0,0 .0, 0.0, -1.5, -0 .5, 3.0, 0.0, 0.0,30, 30, -0 .5, -0 .5, -1 .5, 3.0, 0.0,3 .0, 0.0, 0.0,02, -1.5, 0.0, 0.0, 0.0, 0.0, -25, 03, 30, -1 .8, 02, 0.0, 3.0, 0.0, 30, 0.0, 0.0, 3.0, 02, 0.0, 00, 00, 3.0, 00, 00, -1.5, 03, 0.0, 3.0, -1.5, 0.0,03, -34, -0.4, -1.8, -0.5, -1.8, 0.0, -13, 00, -2.5, -0.5,- 0.5, 00, 00, 0.2, 0.3, 0.0, -0.4, 
-34, -1.8, -1.8, -2.3, 0.3, -0.5, 00, -1.5, 00, 30, 30, -0.5, 3.0, - 1.8, -0.5, -1 .3, 0.0, -1 .8, 3.0, -1 .8, 03, 02, 03, 00, -1 .8, 3.0, 00, 30, 0.3, -34, -04, 3.0, 0.0, - 3.4, -1.5, -1.8, -2.3, 30, 00, 0.0, OJ, 0.0, -2.3, OJ, 30, -1.8, -0.5, 0.3, -1.8, 0.0, 0.0, -0.5, 00, 3.0,00,00, -1.8, -1.5, -2 .5, -0.5, -10, -1.8, -23, 3.0, 0.3, 00, -0.5, 30, -04, 0.3, -23, 30, 3.0,- 1.8, 03, -2.5, -1 .0, -04, -2.5, 03, -1.8, 3.0, 3.0, -1.5, -1 .8, 3.0, 0.2, -1 .5, 00, -0 .5, 03, 00, 30, 0.0,00, 0.2, -1 .8, 00, 3.0, 3.0, 0.0, 3.0, 0 .0, -1 .0, -1 .0, -3.4, 0.0, OJ
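To turn a list of scores into candidate exposed and buried stretches, one option is to group consecutive scores by sign, following the rule stated in Materials and Methods (positive is hydrophilic, negative is hydrophobic). The sketch below is an assumption about how this could be done, not part of the published program; the function names and the separate "neutral" label for scores of exactly 0 are illustrative choices.

```python
# Hypothetical sketch: group consecutive scores by sign into regions.
from itertools import groupby


def classify(score):
    """Label a score by sign; exactly 0 is treated as neutral (an assumption)."""
    if score > 0:
        return "hydrophilic"
    if score < 0:
        return "hydrophobic"
    return "neutral"


def label_regions(scores):
    """Return (label, start, end) runs with 0-based, inclusive indices."""
    regions = []
    start = 0
    for label, run in groupby(scores, key=classify):
        length = len(list(run))
        regions.append((label, start, start + length - 1))
        start += length
    return regions


print(label_regions([0.5, 1.0, -0.2, -0.3, 0.0, 2.0]))
# [('hydrophilic', 0, 1), ('hydrophobic', 2, 3), ('neutral', 4, 4), ('hydrophilic', 5, 5)]
```

The function can be applied either to the per-residue values or to window-averaged scores; in the latter case the indices refer to windows rather than residues.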

Conclusion

A Python-based program was implemented to assess the possible exposed and buried regions of a protein sequence from the hydrophilicity and hydrophobicity values of each amino acid, presented as a Hopp-Woods plot. A Python class was written to check the correctness of the protein sequence submitted in FASTA format. Input validations are applied, and the results are displayed either with the default window size or with a window size chosen by the user. The sialidase-4 test case reported in the paper supports the outcome of the program.

References

  1. Welling GW, Weijer WJ, Van der Zee R, Welling-Wester S (1985) Prediction of sequential antigenic regions in proteins. FEBS Lett 188: 215-218.
  2. Hopp TP, Woods KR (1981) Prediction of protein antigenic determinants from amino acid sequences. Proc Natl Acad Sci 78: 3824-3828.
  3. Kyte J, Doolittle RF (1982) A simple method for displaying the hydropathic character of a protein. J Mol Biol 157: 105-132.
  4. Pandarinath P, Shashi M, Rao AA (2010) Computational study of viral segments inserted within the regions of human genome. J Comput Sci Syst Biol 3: 74-75.

Citation: Pandarinath P, Rao AA, Devi GL (2011) A Python Based Hydrophilicity Plot to Assess the Exposed and Buried Regions of a Protein. J Proteomics Bioinform 4: 145-146.

Copyright: © 2011 Pandarinath P, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.