ISSN: 0974-276X
Short Article - (2011) Volume 4, Issue 6
We have developed an easy-to-use methodology for refining large extensible markup language (XML)-based proteomics datasets with a simple, highly stringent approach that uses a VBA-coded plug-in. We term this methodology "All and None". Candidates that differ significantly between the compared groups are selected based on their presence or absence, followed by peptide screening with a novel and simple approach. Testing of its reliability and efficiency confirmed that All and None is an applicable process for the initial screening of biological biomarkers in complex specimens and tissue extracts.
Keywords: Refining dataset, Proteomics, VBA code.
How many times have we been confused when looking for the truly significant protein hits within a list of dozens of proteins? Have you tried "All and None"?
With the current revolution in proteomics instrumentation and the high performance and sensitivity of mass spectrometry, it has become usual to obtain results with an ever-increasing number of protein hits. At the same time, "junk" proteins are added in parallel. One of the most important challenges has been, and still remains, the difficult and uncertain task of finding the correct protein hit of interest within these huge data outputs; a highly stringent filtering process may therefore sometimes be necessary to rapidly screen for the highly significant proteins of interest. Aside from the quantitative labeling methods that aim to find differentially significant candidates (e.g. ICAT [1], iTRAQ [2], and SILAC [3]) and the label-free methods that use algorithms based on peptide counts in relation to protein length or molecular weight (e.g. spectral count [4], emPAI [5], PAF [6], and APEX [7]) for the same purpose, we created a simple method for the initial refining of large-scale proteomic data, which we term "All and None" [8]. This refining strategy relies mainly on selecting protein candidates with high confidence under stringent conditions using a Visual Basic macro plug-in, followed by manual peptide filtering.
Principle of the "All and None" comparative strategy: how does it work?
As shown in Figure 1, the strategy begins by searching for candidates that are shared across the replicate runs of one experimental group [All] and, at the same time, absent from the counterpart group [None]. Assuming unbiased mass analysis, selecting protein candidates shared within one group ascertains the consistency and relatively high abundance (compared with the counterpart group) of a given protein. Meanwhile, the absence of the same candidate from the counterpart group (that is, it appears in none of that group's runs, not even a single one) indicates its low abundance or lack of translation. In practice, the search can be done simply and robustly by pasting our pre-prepared Visual Basic for Applications (VBA) code (supplement 1) into an Excel spreadsheet, activating it, and starting the comparison. Any desired database identifier can be used for the comparison (IPI, UniProt, GeneID, etc.).
In the next step, the significantly abundant proteins (judged by their appearance in a run) are examined for quality based on peptide scoring and MS/MS spectra, which are ultimately expressed as the protein score. At this point, researchers can set their own criteria for accepting or rejecting those proteins. In our experience, a given protein was of high confidence and high quality when it was mostly supported by more than two corresponding peptides and its scores were above the identity and homology thresholds of the MOWSE algorithm [8,9] used in the Mascot search [10]. High-throughput, accurate Orbitrap mass spectrometry nowadays also allows proteins to be accepted on a single peptide identification. Together, these two filters ensure the selection of highly significant, high-confidence protein candidates. In a previous experiment to validate the efficiency and accuracy of this method, a comprehensive comparison between wild-type and aquaporin 8 knockout mouse proteomes using All and None showed significant down-regulation of α amylase 2 at both the transcriptional and translational levels [11].
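The two filters described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' VBA plug-in: the function names, the protein identifiers, and the threshold values are all hypothetical, and each run is assumed to have already been reduced to a set of database identifiers.

```python
from typing import List, Set


def all_and_none(group_a_runs: List[Set[str]],
                 group_b_runs: List[Set[str]]) -> Set[str]:
    """Identifiers found in every run of group A and in no run of group B."""
    if not group_a_runs:
        return set()
    # [All]: the candidate must be shared by all replicate runs of group A.
    shared_in_a = set.intersection(*group_a_runs)
    # [None]: appearing in even a single run of group B excludes it.
    seen_in_b = set().union(*group_b_runs)
    return shared_in_a - seen_in_b


def passes_peptide_filter(peptide_scores: List[float],
                          score_threshold: float,
                          min_peptides: int) -> bool:
    """Second filter: enough peptides scoring above a chosen threshold.

    Both score_threshold and min_peptides are left to the researcher;
    the values used below are placeholders, not recommended settings.
    """
    good = [s for s in peptide_scores if s > score_threshold]
    return len(good) >= min_peptides


# Hypothetical identifiers: three wild-type runs, two knockout runs.
wild_type = [{"P1", "P2", "P3"}, {"P1", "P2", "P4"}, {"P1", "P2"}]
knockout = [{"P2", "P5"}, {"P3"}]

candidates = all_and_none(wild_type, knockout)
print(candidates)  # {'P1'}

# Assumed peptide scores for the surviving candidate.
print(passes_peptide_filter([45.0, 52.0, 61.0],
                            score_threshold=40.0, min_peptides=3))  # True
```

Here P2 is dropped because it appears in one knockout run, and P3 and P4 are dropped because they are not shared by all wild-type runs. Setting min_peptides to 1 would correspond to accepting single-peptide identifications, as mentioned for Orbitrap data.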
Figure 1: Schematic overview of the All and None refining methodology. Proteomic raw outputs are converted into extensible markup language (XML) files, and a VBA-coded macro plug-in is integrated into Excel to compare these files and find the differentially significant protein candidates (biological markers). In a second step, the corresponding peptide(s) of those proteins are checked to select reliable biomarkers for immunological confirmation.
Pros and cons
The usefulness of the "All and None" search strategy lies in its easy-to-use methodology. In addition, to our knowledge, the high stringency used in this approach is usually accompanied by successful confirmation when candidates are validated immunologically (Figure 2). The criteria for accepting a significant protein based on its peptides can also be set flexibly, according to the researcher's judgment. Finally, the most important feature of this approach is the rapid selection of the top differentially significant proteins, which are more likely to be biomarker candidates. While "All and None" provides a shortcut to finding protein-based biomarkers that differ between groups, one should be aware of the drawbacks of such an approach. For instance, its simplicity makes it essentially a tool for initial screening rather than comprehensive analysis. Moreover, proteins detected in both experimental groups with less significance will be ignored, even though they may be important and still significant. Researchers should also note that successfully picking a possibly significant biomarker with All and None depends greatly on the quality of the mass analysis results, as All and None starts where mass spectrometry ends. Finally, it is worth noting that "All and None", as its name implies, does not provide any quantitative information about over- or under-regulated proteins; instead, it scans for possible biomarkers as an initial step.
Figure 2: Differential shotgun proteomic analysis of wild-type and aquaporin (AQP) 8 knockout mouse colon using the All and None strategy showed significant down-regulation of α amylase 2 at both the protein (A) and mRNA (B) levels, using western blot analysis and real-time PCR (SYBR Green), respectively. CW, wild-type colon; CK, AQP8 knockout colon. Bars represent arbitrary units (AQP8/GAPDH). Error bars represent S.E.M.
Applying the "All and None" search strategy enables us to select differentially significant proteins that are most likely present in one group at a multiple-fold difference compared with the counterpart group. In the latter, the same candidate does not appear in the search because of its absence or very low abundance. In the next step, the quality of those protein candidates is checked through their peptide scoring and MS/MS spectra to ensure high confidence. This approach can help in fishing candidates of interest out of the proteomic ocean in an easy way.
Financial & competing interest disclosure
This work was supported by a JSPS (Japan Society for the Promotion of Science) Grant-in-Aid for Scientific Research (B) to Dr. SM (23790933) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. "All and None" is an approach created and published by S.M. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the work.