ISSN: 2155-9899
Research Article - (2012) Volume 0, Issue 0
Mass Spectrometry-based phosphoproteomics tools are critical to understand the structure and dynamics of signalling that engages and migrates through the entire proteome. Approaches such as affinity purification followed by Mass Spectrometry (MS) have been used to elucidate relevant biological questions in health and disease. Thousands of proteins interact via physical and chemical association. Certain proteins can covalently modify other proteins post-translationally. These post-translational modifications (PTMs) ultimately give rise to the emergent functions of cells in sequence, space and time.
Understanding the functions of phosphorylated proteins thus requires one to study proteomes as linked systems rather than as collections of individual protein molecules.
Knowledge of the interacting proteome, or protein network, has recently received much attention, as network systems (signalling pathways) are effective snapshots in time of the proteome as a whole. MS approaches are clearly essential, in spite of the difficulties posed by some low-abundance proteins (e.g. some kinases).
The term proteomics is defined as the comprehensive analysis of the proteins expressed in cells or tissues, and it can be applied at different stages (e.g. healthy vs. disease). Thus, comparative proteomics can distinguish small but important changes in protein structure, such as post-translational modifications (PTMs), at a depth of several thousand proteins to facilitate drug target identification.
Chemical proteomics can be used to identify drug-target interactions and subsequently analyze drug specificity and selectivity. Furthermore, phosphoproteomic approaches can be exploited to monitor changes in phosphorylation events in order to characterize drug actions on cell signalling pathways and/or signalling cascades. In addition, functional proteomic approaches can be employed to investigate protein-protein and protein-ligand interactions in order to: (a) clarify the mechanism of drug action, (b) identify proteins in disease-related sub-networks and (c) discover novel drug targets.
Moreover, proteins are currently the major drug targets, and therefore play a critical role in the process of modern drug design. This typically involves: (a) the construction of drug compounds based on the structure of a specific drug target, (b) validation of the therapeutic efficacy of the drug compounds, (c) evaluation of drug toxicity, and finally, (d) clinical trials [1].
Recent advancements in MS-based phosphoproteomics have made these approaches the ideal way to study signal transduction, although they require a high degree of specialization and laborious research.
Individual protein phosphorylation events often play important roles in broad signalling networks within a cell. Unfortunately, although phosphorylation of kinases frequently regulates their own activity, kinases are commonly underrepresented in phosphoproteomic studies, at least in part because of their low expression within the cell. Nevertheless, kinase affinity purification techniques have proven to be a viable solution to this drawback. These improvements are helping to obtain relevant data on phosphorylated kinases, the proteins that are the “key” to signalling pathways and to network connectivity among different signalling cascades.
Phosphatases play equally important roles in regulating signalling pathways through the removal of phosphoryl groups from proteins. Indeed, depleting cells of specific protein phosphatases and then applying phosphoproteomic approaches can be used to determine which proteins are regulated by the phosphatase of interest, either directly or downstream [2-7].
General principles of signalling pathways
The best-studied mitogen-activated protein kinases (MAPKs) are the extracellular signal-regulated kinases (ERKs). ERKs phosphorylate cytoplasmic targets or migrate to the nucleus, where they can activate transcription factors involved in cellular proliferation.
As a general overview of orchestrated signalling pathways, (a) signal processing and (b) amplification by plasma membrane-proximal events are followed by communication of the signal to different cellular compartments.
This process involves (a) the activation of multiple signalling cascades by receptors, (b) different protein PTMs, (c) crosstalk between signalling pathways and (d) feedback loops that ensure optimal signalling output.
The binding of receptor Tyrosine (Tyr) kinases (RTKs) to their cognate ligands at the cell surface results in receptor dimerization and autophosphorylation. Phosphorylated Tyr residues subsequently serve as docking sites to recruit signalling mediators, such as growth factor receptor-bound protein 2 (GRB2).
Multiple signalling cascades, such as (a) the phosphoinositide-3 kinase (PI3K)-AKT, (b) Ras-Raf-extracellular signal-regulated kinase (ERK) mitogen-activated protein kinase (MAPK), and (c) signal transducer and activator of transcription (STAT) pathways, are activated by the assembly of these signalling complexes.
On the other hand, Casitas B-lineage lymphoma (CBL)-mediated ubiquitylation of RTKs controls their endocytosis and the duration of receptor signalling. In addition, binding of tumour necrosis factor-α (TNFα) to its receptor, TNFR1, induces trimerization of the receptor and recruitment of the adaptor protein TNFR1-associated death domain (TRADD).
TRADD functions as a hub to assemble a multiprotein signalling complex containing TNFR-associated factor 2 (TRAF2), receptor interacting Ser/Thr protein kinase 1 (RIPK1) and nuclear factor-κB (NF-κB) essential modulator (NEMO). The result is the activation of different signalling networks, such as the ERK MAPK, p38 MAPK and NF-κB pathways. Proteins in the MAPK signalling pathways are activated by both RTKs and TNFα, which allows cells to integrate multiple signals [8-20].
Current and most used phosphoproteomic approaches
Several analytical techniques exist for the analysis of phosphorylation, e.g., Edman sequencing and 32P-phosphopeptide mapping for localization of phosphorylation sites, but these methods either do not allow high-throughput analysis or are very labor-intensive [21]. With MS, by contrast, high-throughput analysis of phosphorylated protein residues is possible [22,23]. Phosphospecific antibodies are routinely used to immunoprecipitate, and thereby enrich, phosphorylated proteins from complex mixtures [24]. Currently, however, no commercially available antibodies are suitable for enriching all phosphorylated proteins, so these proteins must be purified or enriched from complex mixtures using alternative methods [25]. After in-gel or in-solution trypsin digestion of complex protein mixtures, the resulting phosphopeptides and non-phosphopeptides can be loaded onto different metal ion-based chromatography media (e.g. Immobilized Metal ion Affinity Chromatography, IMAC (Fe3+), and titanium dioxide, TiO2 [26]) in order to enrich for phosphopeptides. The enriched solution can then be applied to different reverse-phase chromatography media (e.g. graphite powder [27], POROS R3 [25]) in order to clean and desalt the eluted phosphopeptides. Moreover, all of these chromatographic steps reduce the suppression of phosphorylated peptides in the mass spectra.
Using IMAC (Fe3+) and TiO2 [26], the negatively charged phosphopeptides are purified by their affinity for positively charged metal ions, but some of these methods suffer from the binding of acidic, non-phosphorylated peptides. Ficarro and co-workers [22] circumvented this problem on IMAC (Fe3+) by converting acidic peptides to methyl esters, at the cost of increased spectral complexity and a required lyophilization of the sample, which causes adsorptive losses, particularly of phosphopeptides [28]. Ficarro et al. [22] were able to sequence hundreds of phosphopeptides from yeast, including Slt2p kinase, but the number of phosphorylated residues identified from kinases was low compared to that from phosphoproteins highly expressed within the cell. More recently, TiO2 chromatography using 2,5-dihydroxybenzoic acid (DHB) was introduced as a promising strategy by Larsen et al. [26]. TiO2/DHB resulted in higher specificity and yield than IMAC (Fe3+) for the selective enrichment of phosphorylated peptides from model proteins (e.g. bovine lactoglobulin, bovine casein).
Another important limitation of phosphoenrichment methods is that mainly phosphopeptides from highly expressed proteins can be purified, while those from phosphorylated proteins with low expression levels (e.g. kinases) do not bind as well to these resins. Either the proportion of such proteins is too low, or the amount that does bind to the metal ions is not sufficient to be detected by MS. The combination of Strong Cation Exchange Chromatography (SCX) with IMAC (Fe3+) has been applied to yeast, resulting in a large number of identified phosphorylated residues (over 700, including Fus3p kinase) [23]. Although more than 100 signalling proteins and functional phosphorylation sites were identified, including receptors, kinases and transcription factors, it was clear that only a fraction of the phosphoproteome was revealed [23].
It is clear that methodologies to enrich for phosphorylated residues from kinases should be improved. However, this is not straightforward for several reasons: (a) the low abundance of these signalling molecules within cells; (b) the duration of the stress or stimulation, as only a small fraction of phosphorylated kinases is available at any given time as a result of a stimulus, and the adaptation of signalling pathways over time is a fast and relevant factor in kinase phosphorylation [29]; and (c) the current phosphoenrichment methods, which are mainly successful at purifying phosphopeptides from highly expressed proteins.
Here we describe, in simple terms, the manual validation of phospho-data (assignment of the phosphate group to specific amino acids) obtained in an MS experiment using CID (Collision Induced Dissociation). When peptide ions are fragmented via CID, series of y- and b-ions are formed [30,31]. The peptide sequence is obtained by correlating the mass differences between peaks in the y-ion series, or between peaks in the b-ion series, with amino acid residue masses. CID fragmentation occurs mainly on the peptide backbone, so sequence information is obtained. For phosphotyrosine residues, partial neutral loss of HPO3 (80 Da) is observed in MS2 mode, and the phosphate group on tyrosine (Tyr) residues is more stable than on serine (Ser) and threonine (Thr) residues. In addition, the characteristic phospho-fingerprint of phosphotyrosine is the phosphotyrosine immonium ion (~216 Da) [32,33]. In MS3 mode, the ion originating from the neutral loss (NL) of phosphoric acid (H3PO4) is automatically selected for further fragmentation, which makes it possible to apply additional energy to fragment the peptide backbone. The MS3 mode relies on the fact that phosphorylation on Ser and Thr residues is labile, so that conventional fragmentation via CID commonly results in the partial NL of H3PO4 (98 Da) in MS2 mode. This is due to gas-phase β-elimination of the phosphoester bond, which generates dehydroalanine (from Ser, ~69 Da) and dehydro-2-aminobutyric acid (from Thr, ~83 Da) [32,33].
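The correlation between fragment-ion mass differences and residue masses described above can be illustrated with a minimal Python sketch. It is not taken from the cited work: the function name fragment_ions, the peptide "ASYK" and the reduced residue table are hypothetical and chosen only for illustration, and the masses are standard monoisotopic values. The sketch computes singly charged b- and y-ions for a peptide carrying one phosphate group (+HPO3) and shows that two positional isomers differ only in the fragments that lie between the candidate sites.

```python
# Monoisotopic residue masses in Da (standard values; only a few residues
# are listed to keep the sketch short).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111, "Y": 163.06333,
}
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # H2O, Da
HPO3 = 79.966331    # mass added by one phosphate group, Da


def fragment_ions(sequence, phospho_site=None):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    phospho_site is a 0-based index of a Ser, Thr or Tyr residue
    (hypothetical helper, for illustration only).
    """
    residues = [RESIDUE_MASS[aa] for aa in sequence]
    if phospho_site is not None:
        if sequence[phospho_site] not in "STY":
            raise ValueError("phospho_site must point at Ser, Thr or Tyr")
        residues[phospho_site] += HPO3

    b_ions, y_ions = [], []
    total = 0.0
    for mass in residues[:-1]:            # b1 .. b(n-1), from the N-terminus
        total += mass
        b_ions.append(total + PROTON)
    total = 0.0
    for mass in reversed(residues[1:]):   # y1 .. y(n-1), from the C-terminus
        total += mass
        y_ions.append(total + WATER + PROTON)
    return b_ions, y_ions


# Hypothetical peptide with two candidate sites: Ser (index 1), Tyr (index 2).
b_pS, y_pS = fragment_ions("ASYK", phospho_site=1)
b_pY, y_pY = fragment_ions("ASYK", phospho_site=2)

# Only b2 and y2 separate the two isomers; both are shifted by HPO3.
site_determining_shift = b_pS[1] - b_pY[1]
```

In this toy case b1, b3, y1 and y3 are identical for both isomers, which is why site assignment depends on observing the few fragments that fall between the candidate residues.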
Alternative phosphopeptide enrichment strategies
Phosphopeptides are de-protected and collected under acidic conditions, and a variety of chemical methodologies have likewise appeared. BEMA (β-elimination/Michael addition) takes advantage of the ease of β-elimination of phosphorylated Ser and Thr residues at basic pH and the ability to subject the resulting dehydroalanine/methyl-dehydroalanine products to Michael addition with a desired tag for affinity purification [34-36]. In addition, calcium phosphate precipitation (CPP) has proven to be a fast, economical and simple enrichment technique [37], at the cost of diminished specificity. Moreover, PhosphorAmidate Chemistry (PAC) is another important approach, in which phosphopeptides are coupled to a solid-phase support such as an amino-derivatized dendrimer or controlled-pore glass derivatized with maleimide for selection [38,39].
Tandem MS methodology: basic issues relevant to phosphoproteomics via electrospray ionization (ESI)
As a general rule, during MS-based experiments, a phosphopeptide mixture is separated using capillary liquid chromatography (LC). A typical separation column is 25 to 100 microns in diameter and 5 to 30 cm in length. The eluent is concurrently introduced into the mass spectrometer via electrospray ionization (ESI), a process that generates multiply protonated gas-phase peptide cations. The mass-to-charge ratio (m/z) and intensity (I) of the intact peptide precursors are recorded by an initial MS scan, commonly referred to as a full scan MS. Then, the m/z values of high-intensity peaks (a list of masses) are automatically selected in order of decreasing abundance for sequencing by tandem MS (MS/MS). This process of precursor selection, dissociation and fragment ion mass analysis is performed repetitively on analyte species as they elute from the LC column. Ideally, MS/MS interrogation of a phosphorylated peptide generates a series of fragment ions that differ in mass by a single amino acid, such that the peptide primary sequence and the position of the phosphorylated modifications can be determined. This necessitates peptide bond cleavage that is not only specific to the peptide backbone, but also robust enough to distinguish peptides whose primary amino acid sequences are the same, yet vary in the site of phosphorylation (i.e., positional isomers) [40].
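The precursor selection step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the acquisition logic of any particular instrument: the function name select_precursors, its parameters top_n and min_intensity, and the peak list are all hypothetical.

```python
def select_precursors(full_scan, top_n=3, min_intensity=0.0):
    """Pick precursor m/z values from a full scan for MS/MS.

    full_scan is a list of (mz, intensity) pairs. Peaks at or above
    min_intensity are ranked by decreasing intensity and the m/z values
    of the top_n most abundant ones are returned in that order.
    """
    candidates = [peak for peak in full_scan if peak[1] >= min_intensity]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in candidates[:top_n]]


# Hypothetical full scan: (m/z, intensity) pairs.
full_scan = [(400.2, 1.0e4), (512.3, 5.0e5), (623.8, 2.0e5), (701.4, 8.0e4)]
selected = select_precursors(full_scan, top_n=3, min_intensity=5.0e4)
```

Because selection is driven by abundance, peptides from low-abundance proteins such as kinases are the least likely to be chosen for sequencing, which is one reason enrichment before LC-MS matters.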
The dominant NL peak in the fragmentation spectra of phosphopeptides obtained via traditional collisionally induced dissociation (CID) has received much attention [41-43]. The NL peak can easily suppress sequence diagnostic ion peaks causing identification of the peptide to become very difficult and sometimes impossible.
Since ion traps are currently the most common mass spectrometers for phosphoproteome analyses, there have been various attempts to combat this specific problem. Modified fragmentation regimes have been introduced, such as (a) NL-triggered MS3 or (b) multistage activation (MSA), which alleviate the neutral loss issue. NL MS3 and MSA methods further fragment the NL peak of the precursor ion in order to generate more backbone cleavages. These additional backbone cleavages then provide a more diagnostic source for peptide sequencing [23,44-46].
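The idea behind an NL-triggered MS3 scan can be sketched in a few lines. The sketch below is an assumption-laden illustration rather than a vendor algorithm: the function names, the tolerance tol and the top_k criterion are hypothetical. The one fixed point is the arithmetic: a neutral loss of H3PO4 (97.977 Da) from a precursor of charge z appears in the MS2 spectrum at the precursor m/z minus 97.977/z.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid, Da


def neutral_loss_mz(precursor_mz, charge, loss=H3PO4):
    """m/z at which the neutral-loss product of a precursor is expected."""
    return precursor_mz - loss / charge


def triggers_ms3(ms2_peaks, precursor_mz, charge, tol=0.5, top_k=3):
    """Return True if the NL peak is among the top_k most intense MS2 peaks.

    ms2_peaks is a list of (mz, intensity) pairs; tol is the m/z matching
    window. Both thresholds are illustrative assumptions.
    """
    target = neutral_loss_mz(precursor_mz, charge)
    ranked = sorted(ms2_peaks, key=lambda peak: peak[1], reverse=True)[:top_k]
    return any(abs(mz - target) <= tol for mz, _ in ranked)


# Hypothetical doubly charged precursor at m/z 600.0.
nl_target = neutral_loss_mz(600.0, 2)
dominant_nl = [(551.0, 9.0e4), (300.1, 1.0e4), (450.2, 5.0e3)]
no_nl = [(300.1, 1.0e4), (450.2, 5.0e3), (520.7, 2.0e3)]
fire = triggers_ms3(dominant_nl, 600.0, 2)
skip = triggers_ms3(no_nl, 600.0, 2)
```

For a doubly charged precursor the expected shift is about 49 m/z units, and for a triply charged one about 32.7, so the charge state has to be known before the loss can be recognized.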
Alternatively, electron transfer dissociation (ETD) and electron capture dissociation (ECD) have also shown great promise, since the phosphate group remains attached during and after activation. Many detected phosphopeptides contain multiple Ser/Thr/Tyr residues, so there is often more than one possible location for the site of phosphorylation within the peptide. The abundant NL observed in low-energy CID can hamper the correct assignment of the phospho-sites in such peptides. Thus, a concerted effort has been made to understand, in detail, the rules of phosphopeptide fragmentation [47-51].
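To make the site-assignment ambiguity concrete, the short sketch below enumerates every possible placement of a given number of phosphate groups on the Ser/Thr/Tyr residues of a peptide. The function name candidate_site_assignments and the peptide "ASTYK" are hypothetical; the sketch only counts the alternatives a localization step would have to discriminate and does not score them.

```python
from itertools import combinations


def candidate_site_assignments(sequence, n_phospho):
    """List all placements of n_phospho phosphate groups on S/T/Y residues.

    Each placement is a tuple of 0-based residue indices.
    """
    sites = [i for i, aa in enumerate(sequence) if aa in "STY"]
    return list(combinations(sites, n_phospho))


# Hypothetical peptide with three candidate residues (S, T and Y).
singly = candidate_site_assignments("ASTYK", 1)
doubly = candidate_site_assignments("ASTYK", 2)
```

Here a singly phosphorylated peptide has three positional isomers and a doubly phosphorylated one also has three, all with the same precursor mass.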
Improved biomarkers are of vital importance for cancer detection, diagnosis and prognosis. While significant advances in understanding the molecular basis of disease are being made in genomics, proteomics will ultimately delineate the functional units of a cell: proteins and their intricate interaction networks and signalling pathways in health and disease.
Much progress has been made in characterizing thousands of proteins qualitatively and quantitatively in complex biological systems by use of multi-dimensional sample fractionation strategies, MS and protein micro-arrays. Comparative/quantitative analysis of high-quality clinical biospecimens (e.g., tissue and biofluids) from the human cancer proteome landscape can potentially reveal protein/peptide biomarkers responsible for this disease by means of their altered levels of expression, their PTMs and different forms of protein variants. Despite technological advances in proteomics, major hurdles still exist at every step of the biomarker development pipeline [52-63].
In the post-genome era, the field of proteomics has incited great interest in the pursuit of protein/peptide biomarker discovery, especially since MS demonstrated the capability of characterizing a large number of proteins and their PTMs in complex biological systems, in some instances even quantitatively. Technological advances such as protein/antibody chips, depletion of multiple high-abundance proteins by affinity columns, affinity enrichment of targeted protein analytes and multidimensional chromatographic fractionation have all expanded the dynamic range of detection for low-abundance proteins by several orders of magnitude in serum or plasma, making it possible to detect the more abundant disease-relevant proteins in these complex biological matrices [63-71]. However, plasma- and cell extract-based discovery studies aimed at identifying low-abundance proteins (e.g. some kinases) are extremely difficult. Therefore, significant technological improvements are needed to identify these low-abundance, yet high-biological-impact molecules. Moreover, if the protein kinases under study carry PTMs, spatial and temporal factors can decrease the efficiency of the study (e.g. many kinases are regulated by phosphorylation of the activation loop, which then directly reflects cellular kinase activity).
Furthermore, proteomics has been widely applied in various areas of science, ranging from the deciphering of the molecular pathogenesis of diseases and the characterization of novel drug targets to the discovery of potential diagnostic and prognostic biomarkers, where the technology is capable of identifying and quantifying proteins associated with a particular disease (e.g., biomarker candidates) by means of their altered levels of expression [72-74] and/or PTMs [75-77] between the control and disease states. This type of comparative (semi-quantitative) analysis enables correlations to be drawn between the range of proteins, their variations and modifications produced by a cell, tissue or biofluid and the initiation, progression, therapeutic monitoring or remission of a disease state.
PTMs including phosphorylation, glycosylation, acetylation and oxidation, in particular, have been of great interest in this field as they have been demonstrated as being linked to disease pathology and are useful targets for therapeutics.
In addition to MS-based large-scale protein and peptide sequencing, other innovative approaches including self-assembling protein microarrays [78] and bead-based flow cytometry [79,80] to identify and quantify proteins and protein-protein interaction in a high throughput manner have furthered our understanding of the molecular mechanisms involved in diseases.
In summary, clinical proteomics has come a long way in the past decade in terms of technology/platform development and protein chemistry which, together with bioinformatics, help to identify molecular signatures of diseases based on protein pathways and signalling cascades. Hence, there is great promise for disease diagnosis, prognosis, and prediction of therapeutic outcome on an individualized basis. However, without correct study design and implementation of robust analytical techniques, the efforts and expectations to make biomarkers a useful reality in the near future can easily be hindered.
Phosphorylation is a key reversible modification that regulates protein function, subcellular localization, complex formation and protein degradation, and therefore cell signalling networks. Given all of these effects, it is estimated that up to 30% of all proteins may be phosphorylated, some multiple times.
Phosphoproteomics is a branch of proteomics that identifies, catalogs, and characterizes proteins containing a phosphate group as a PTM. Moreover, phosphoproteomics provides clues as to which protein or pathway might be activated, because a change in phosphorylation status almost always reflects a change in protein activity. Indeed, it can indicate which proteins might be potential drug targets, as exemplified by kinase inhibitors. While phosphoproteomics will greatly expand knowledge about the numbers and types of phosphoproteins, its greatest promise is the rapid analysis of entire phosphorylation-based signalling networks.
Nevertheless, methodologies to enrich for phosphorylated residues from kinases should be improved, especially because of the low abundance of these signalling molecules within cells.
The authors declare that they have no competing interests.
EL, JLP and JS carried out Clinical Proteomics studies for this short review in order to develop future Clinical Phosphoproteomics research studies and publish this article. All authors read and approved the final manuscript.
EL holds a PhD and is a recipient of a Post-doctoral fellowship of Ministerio de Ciencia e Innovacion de España.
JLP and JS hold MD and PhD degrees and tenured positions at Hospital Universitario 12 de Octubre and Carlos III, respectively.