ISSN: 0974-276X
Research Article - (2008) Volume 1, Issue 6
Cathepsin L is a cysteine protease that degrades connective tissue proteins such as collagen, elastin and fibronectin. Increased expression of cathepsin L in the aged kidney leads to considerable loss of organ function in old age. It has recently been reported that SARS-CoV and SARS-CoV spike protein-pseudotyped retroviruses utilize the enzymatic activity of the endosomal protease cathepsin L for viral entry. In this report, a 3D structure of rat cathepsin L was constructed through homology modeling, using the X-ray structure of procathepsin L from Homo sapiens (PDB code: 1CS8) as the template. The homology modeling was done with the MODELLER 9v2 software. The final model was obtained by molecular mechanics and dynamics methods and was assessed with PROCHECK and a VERIFY3D graph, which indicated that the final refined model is reliable. The model could be further explored for characterizing the protein.
Keywords: Homology modeling; Cathepsin L; Protease; Aging; Structural Bioinformatics
Abbreviations: EM, Energy Minimization; BLAST, Basic Local Alignment Search Tool
Cathepsin L is a member of the papain superfamily of lysosomal cysteine proteases and is one of the most powerful endopeptidases. Its usual function is to regulate cellular protein turnover in the lysosome (Kirschke H et al., 1995; Kazunobu T et al., 2004; Kramer G et al., 2007). Together with cathepsins B and H, it plays an important role in the degradation of both endogenous and exogenous proteins. Cathepsin L is initially translated as preprocathepsin L, transferred through the Golgi apparatus as procathepsin L and stored in lysosomes as mature cathepsin L (Chauhan SS et al., 1993; Ishidoh K et al., 1998). Overexpression of procathepsin L in human melanoma cells increases their tumorigenicity and switches their phenotype from non-metastatic to highly metastatic (Nathalie R et al., 2004). The enforced expression and secretion of procathepsin L by human melanoma cells arms them with the ability to inactivate complement-mediated cell lysis and contributes to tumor growth and metastasis (Frader R et al., 1998). Cathepsin L is upregulated in rat kidney during aging (Debata et al., 2007). It also influences the expression of extracellular matrix in lymphoid organs and plays a role in the regulation of thymic output and of peripheral T cell number (Lombardi et al., 2005). SARS-CoV and SARS-CoV spike protein-pseudotyped retroviruses have been reported to utilize the enzymatic activity of the endosomal protease cathepsin L for viral entry (Huang I et al., 2006; Li F et al., 2006).
It plays diverse roles in different organs and tissues, including maintenance of heart structure and function (Stypmann J et al., 2002), epidermal differentiation and hair follicle morphogenesis and cycling (Benavides F et al., 2002), development of type 1 diabetes in the NOD mouse (Maehr R et al., 2005), tumor metastasis (Lah TT et al., 1998), thyroid function (Friedrichs B et al., 2003), modification of extracellular matrices in preovulatory follicles (Salustri et al., 1999), degradation of basement membrane in the kidney (Baricos WH et al., 1998), podocyte migration in nephrotic syndrome (Reiser J et al., 2004), regulation of thymic output and peripheral T cell number (Lombardi et al., 2005) and generation of MHC class II-bound peptide ligands presented by cortical thymic epithelial cells (Honey K et al., 2002). Our previous study documented that expression of the cathepsin L gene is significantly upregulated in rat kidney during aging (Debata PR et al., 2007). A similar result was also demonstrated at the protein level (Kim CH et al., 2004). Upregulation of cathepsin L has also been found in skeletal muscle wasting in septic muscle (Deval C et al., 2001) and in scrapie-infected Neuro2a cells (Zhang Y et al., 2003).
In this communication, an effort was made to generate a three-dimensional (3D) model of the rat cathepsin L protein based on the available template crystal structure of procathepsin L from the Protein Data Bank (PDB code: 1CS8) (Berman HM et al., 2000). The structural information on cathepsin L could prove useful for further characterizing the protein.
Comparative Modeling of Rat Cathepsin L
The amino acid sequence of rat cathepsin L was retrieved from the NCBI sequence database (ID: AAH63175). The three-dimensional structure of the protein was found not to be available in the Protein Data Bank; hence the present exercise of developing a 3D model of rat cathepsin L was undertaken.
A BLAST (Altschul S F et al., 1990) search was performed against the Brookhaven Protein Data Bank (PDB) with the default parameters to find suitable templates for homology modeling. The hits were aligned, and the one with the highest score, the lowest e-value and the maximum sequence identity (73%) was used as the reference structure to build a 3D model of rat cathepsin L. The rat cathepsin L structure was thus modeled by a comparative modeling procedure using 1CS8 as the template. The rat cathepsin L sequence was also submitted to the Genesilico protein fold-recognition metaserver; the fold-recognition servers FUGUE and 3D-PSSM reported 1CS8 as the best template with a highly significant score. The sequence alignment of rat cathepsin L and 1CS8 was carried out using the CLUSTAL W program (Thompson J D et al., 1994) (http://www.ebi.ac.uk/clustalw). The alignment was manually refined at some loop regions of the template. The academic version of MODELLER 9v2 (Sali A et al., 1993) was used for model building. The backbone of the core regions of the protein was transferred directly from the corresponding coordinates of 1CS8. Side-chain conformations for the backbone residues were generated automatically by homology. Out of 20 models generated by MODELLER, the one with the best PROCHECK G-score (Laskowski R A et al., 1993) and the best VERIFY3D (Luthy R et al., 1992) profile was subjected to energy minimization. A distance-dependent dielectric constant (ε = 1.0), a non-bonded cutoff of 14 Å, the CHARMM force field (Brooks et al., 1993) and CHARMM all-atom charges were used for the energy minimization. Initially, 800 steps of steepest descent were used to remove close van der Waals contacts, followed by 1000 iterations of conjugate gradient minimization, until the maximum derivative was less than 20.0 kcal mol⁻¹ nm⁻¹. All hydrogen atoms were included during the calculation. The energy minimization was started with the core main chain and then extended to all the core side chains.
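To make the electrostatic settings above concrete, the following is a minimal Python sketch and not the force-field implementation used in this work. It assumes that a distance-dependent dielectric means an effective dielectric of ε·r, so that each atom pair contributes 332.06·q_i·q_j/(ε·r²) kcal/mol (r in Å, charges in elementary units), and that pairs beyond the 14 Å cutoff are simply dropped. The conversion constant, the hard truncation, the absence of bonded-pair exclusions and the function name are all assumptions made for illustration.

```python
import math

# Approximate Coulomb conversion factor in kcal*Angstrom/(mol*e^2) (assumed value).
COULOMB_KCAL = 332.06


def electrostatic_energy(atoms, eps=1.0, cutoff=14.0):
    """Sum pairwise Coulomb energies using a dielectric of eps * r.

    atoms: list of (x, y, z, charge) tuples, coordinates in Angstrom.
    Pairs farther apart than `cutoff` are ignored (hard truncation,
    an illustrative simplification).
    """
    total = 0.0
    for i in range(len(atoms)):
        xi, yi, zi, qi = atoms[i]
        for j in range(i + 1, len(atoms)):
            xj, yj, zj, qj = atoms[j]
            r = math.dist((xi, yi, zi), (xj, yj, zj))
            if r > cutoff:
                continue
            # Dielectric eps * r turns the usual 1/r term into 1/r^2.
            total += COULOMB_KCAL * qi * qj / (eps * r * r)
    return total


# Hypothetical toy pair: opposite unit charges 2 Angstrom apart.
pair_energy = electrostatic_energy([(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, -1.0)])
```

With this form the interaction falls off as 1/r², which is one reason such a dielectric is often paired with a finite cutoff in calculations without explicit solvent.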
All calculations were performed using the Accelrys DS Modeling 2.0 software suite (Accelrys Inc., San Diego, CA 92121, USA). During these steps the quality of the initial model was improved. VERIFY3D was used to check the residue profiles of the three-dimensional models. To assess the stereochemical quality of the three-dimensional models, a PROCHECK analysis was performed and a Ramachandran plot was drawn.
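A Ramachandran plot is built from the backbone torsion angles φ and ψ of each residue. As an illustration only (PROCHECK computes these internally; this is not its code), the sketch below calculates a torsion angle from four atomic positions. For φ the four atoms are C(i-1), N(i), Cα(i), C(i); for ψ they are N(i), Cα(i), C(i), N(i+1). The function and helper names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Return the torsion angle p0-p1-p2-p3 in degrees, in (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy geometry (not real protein coordinates): a +90 degree torsion.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Using atan2 rather than acos keeps the sign of the angle, which is what distinguishes, for example, left-handed from right-handed helical regions on the plot.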
Based on the maximum identity, high score and low e-value in a BLAST (Altschul S F et al., 1990) search against the Brookhaven Protein Data Bank (PDB) with the default parameters, 1CS8 (PDB code) was used as the structural template for modeling the rat cathepsin L protein. The sequence-structure alignment used for model building is shown in Figure 1. The alignment is characterized by some insertions and deletions in the loop regions. Since the first 17 residues from the N-terminal end had no equivalent region in 1CS8, the modeling was carried out from the 18th to the 317th residue. This was followed by rigorous refinement of the model by means of EM; the final stable structure of rat cathepsin L is shown in Figure 2. The model has 89% of its residues in the most favored regions of the Ramachandran map (Figure 3), a PROCHECK G-score of 0.03 and a satisfactory VERIFY3D profile. The predicted 3D model of rat cathepsin L should be useful in studying the actual structure of the protein.
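Template selection above rests on sequence identity between target and template. The sketch below shows one simple way such a percentage can be computed from a pairwise alignment. It assumes identity is counted over columns where both sequences have a residue; other tools, BLAST included, may use a different denominator, so the value need not match a reported figure exactly. The function name and the toy sequences are hypothetical and are not the rat cathepsin L or 1CS8 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # insertion or deletion column: skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment with one gap column and one mismatch.
identity = percent_identity("ACD-FG", "ACEHFG")  # 4 of 5 compared columns match: 80.0
```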
Validation of the model was carried out after the refinement process using Ramachandran map calculations computed with the PROCHECK program. The φ and ψ distributions of the non-glycine, non-proline residues in the Ramachandran map are summarized in Figure 3 and Table 1. The model has 89% of its residues in the most favored regions of the Ramachandran map, with a PROCHECK G-score of 0.03 and a satisfactory VERIFY3D profile.
Residues in most favored regions | 89.0% |
Residues in additionally allowed regions | 10.7% |
Residues in generously allowed regions | 0.0% |
Residues in disallowed regions | 0.3% |
Non-glycine and non-proline residues | 100.0% |
Table 1: Ramachandran plot calculation for the 3D model of rat cathepsin L, computed with the PROCHECK program.
The structural superimposition of the Cα traces of the template and rat cathepsin L is shown in Figure 4. The weighted root mean square deviation of the Cα trace between the template and the final refined model was 0.43 Å, which suggests that the model
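As a worked illustration of the deviation measure quoted above, the sketch below computes a weighted root mean square deviation between two equally long sets of Cα coordinates. It assumes the two structures have already been superimposed and that the atoms are paired one to one. The source does not state how the weights were defined, so the weights argument and the function name are hypothetical.

```python
import math


def weighted_rmsd(coords_a, coords_b, weights=None):
    """Weighted RMSD between paired, already superimposed coordinates.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples in Angstrom.
    weights: optional per-atom weights; equal weights are used if omitted.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    if weights is None:
        weights = [1.0] * len(coords_a)
    weighted_sq = 0.0
    for a, b, w in zip(coords_a, coords_b, weights):
        weighted_sq += w * sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(weighted_sq / sum(weights))


# Toy coordinates: every atom displaced by 1 Angstrom along z.
rmsd = weighted_rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```

In practice the superposition itself is found first, typically by a least-squares fit, and the deviation is then evaluated on the fitted coordinates.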
The amino acid sequences of the template and the final structure were aligned using CLUSTAL W, and the annotated alignment was generated using the JOY server (protein sequence-structure representation and analysis; Mizuguchi K et al., 1998). Given their PDB files, the secondary structures were also analyzed and compared with the JOY program. The secondary structures of the template and the final model of rat cathepsin L are highly conserved (Figure 5), which supports the reliability of the final model.
Figure 5: Structure-based sequence alignment of the template and final structures of rat cathepsin L using the JOY program. The key to the JOY annotation is as follows: lowercase red letter, α-helix; lowercase blue letter, β-strand; lowercase maroon letter, 3₁₀-helix; uppercase letter, solvent-inaccessible residue; lowercase letter, solvent-accessible residue; italic lowercase letter, positive φ torsion angle.
In this report, a molecular model of the rat cathepsin L protein was constructed through homology modeling; the model could be used for further characterization of the protein.
This work was supported by funds from Distributed Information Sub-Center (BT/BI/04/058/2002), Department of Biotechnology, Ministry of Science and Technology, Government of India and Institutional funds of the Institute of Life Sciences, Bhubaneswar, India.