ISSN: 0974-276X
Research Article - (2018) Volume 11, Issue 1
Introduction: Quantitative proteomics approaches have provided insight into biomarkers of cancer and other diseases with high sensitivity, high specificity, and high analytical precision. Multiple myeloma is an incurable, fatal blood cancer characterized by clonal expansion of plasma cells in the bone marrow. Current multiple myeloma proteomic research mainly focuses on serum biomarkers, not plasma cells, due to technical difficulties including a requirement for tumor cell isolation from bone marrow aspirates, tumor cell paucity, and poor in vitro survival after isolation.
Materials and methods: A global proteomic analysis was performed using sorted bone marrow plasma cells from normal donors and multiple myeloma patients and a large-scale quantitative mass spectrometry platform. A selected panel of up- and down-regulated proteins was validated by multiple-reaction-monitoring.
Results: We identified a panel of 18 up- and down-regulated potential biomarkers of multiple myeloma, which can be further clinically validated for their potential use as disease-specific biomarkers or signature molecules for monitoring disease progression.
Conclusion: The study demonstrates a good example of using proteomics as a tool for the development of clinical biomarkers for diagnosis, prognosis, and drug target discovery.
Keywords: Multiple myeloma; Biomarker; Mass spectrometry
Multiple myeloma (MM), the second most common blood cancer, remains incurable despite advances in treatment [1]. The communication between cancer cells and the bone marrow niche drives the disease phenotypes and treatment response. Detailed analyses of both the cancer cells and the niche are therefore crucial to new target identification and a cure. Omics technology has helped expand the knowledge of cancer biology and identify new targets for therapy [2-5]. Complementary to genetic analysis and gene expression profiling [6,7], proteomics allows the study of global protein expression, post-translational modifications, protein-protein interactions, and ultimately protein functions [8,9]. However, to date the advance of proteomics in MM research lags significantly behind the other mentioned technologies. Proteomic studies in myeloma have been largely limited to serum profiling [10-13]. The clinical applicability of the few published reports on cellular profiling of MM cells is hampered at least in part by the following: 1) the different techniques and statistical platforms used by different investigators, which prevent cross-validation; 2) the lack of standard procedures for sample collection, storage, and processing, which causes background noise and lowers the sensitivity of the test; and 3) limited sample numbers, which leave too little statistical power for clinical correlation analysis. In addition, to date proteomic profiling in MM has mostly been based on cell lines in culture [14,15], which may not be representative of primary samples. Primary MM cells are not sustainable in culture due to their crucial reliance on the bone marrow niche. Therefore, immediate sample processing is required to avoid background noise from cell apoptosis.
An ideal method for isolating MM cells from other bone marrow cells should be simple and quick and should yield high purity. MM cells do not tolerate the thawing process. Once MM cells are isolated, a storage method is required that maximizes protein quality for the subsequent proteomic analysis. The advance of proteomics in MM will be greatly facilitated by a standardized sample collection procedure and an established clinical database.
Even though a tremendous effort has been made to improve proteomics technologies [16], there are still numerous challenges associated with even the most advanced technologies for analysis of global protein expression or post-translational modifications. These challenges include: 1) sensitivity, resolution, and accuracy of the instrument and ability to identify novel proteins; 2) the ability to achieve moderate to high throughput; 3) the ability to achieve broad coverage of protein mass and abundance (dynamic range); and 4) the ability to quantitatively analyze protein expression with high precision. At present, there is no consensus within the field of proteomics that any one technology can attain complete and quantitative coverage of all proteins in a given tissue or biofluid. While the two-dimensional gel electrophoresis (2DE) platform is still being used by some labs in proteomics research, its inability to widen the protein dynamic range and its labor-intensiveness remain major disadvantages. One alternative approach to overcome this drawback is the non-gel-based liquid chromatography mass spectrometry shotgun proteomic technology [15-22]. It provides a powerful tool to resolve and identify thousands of proteins from a complex biological sample. This approach is rapid and more sensitive, and it usually increases the protein dynamic range 4- to 5-fold as compared to 2DE [23]. The biggest advantage of this method is that it can be automated and is capable of large-scale proteome analysis. Although some successes using isotopic labeling technology for protein quantification have been reported [24], it remains technically difficult to comprehensively characterize the global proteome due to the high costs of the labeling reagents and the nature of the methodology, e.g., proteins without certain amino acid residues cannot be labeled.
In the past few years, ion intensity-based and spectral counting-based label-free quantitative approaches have gradually gained popularity in this regard, concomitant with significant improvements in mass spectrometer performance and bioinformatics [22]. Label-free quantification has become the platform of choice for many unbiased biomarker discovery studies today [25,26].
In this study, we applied an ion intensity-based label-free protein quantification technology [27,28] to analyze global protein expression profiles of plasma cells from MM patients and healthy controls. This method is high-throughput and sensitive enough to detect and quantify thousands of proteins in complex biological samples, allowing potential MM protein biomarkers with high discriminatory ability to be identified.
Chemicals and reagents
Urea (99.5%), dithiothreitol (DTT), iodoacetamide, acetonitrile, and ammonium bicarbonate were all purchased from Sigma-Aldrich (St. Louis, MO, USA). Modified trypsin was purchased from Promega (Madison, WI, USA). Heat-inactivated Fetal Bovine Serum Premium was purchased from Atlanta Biologicals (Lawrenceville, GA, USA).
Plasma cells
The integrity of proteomic data relies on the purity of the plasma cells in the input samples. In this study, bone marrow mononuclear cells (BMNCs) were isolated from the bone marrow aspirates by Ficoll gradient centrifugation. Plasma cells were isolated from BMNCs based on cell surface antigen expression, either by flow cytometric sorting or by Magnetic-Activated Cell Sorting (MACS) after staining with magnetically conjugated antibodies. However, the purity of the isolated cells depends largely on the prevalence of the population of interest in the sample. In addition, CD138, a universal marker for plasma cells, was unstable and was lost during sample processing. Short analysis and isolation times are therefore required for isolating plasma cells. We compared three different methods for plasma cell isolation with respect to processing time, purity, and yield. Reported data are based on plasma cells isolated in the same manner. All pellets were frozen at -80°C until all samples were collected.
1. CD138+ MACS: BMNCs were incubated with monoclonal mouse anti-human CD138 antibody conjugated to immunomagnetic microbeads (Miltenyi Biotec, Bergisch Gladbach, Germany), washed using PBS containing 2% bovine serum albumin and 1 mmol/L EDTA (bead buffer), and loaded into the magnetic column cell separator. After three washes, cells were eluted from the column and checked for purity by flow cytometry using CD45-FITC (Becton Dickinson, Franklin Lakes, NJ, USA), CD38-Cy5 (Becton Dickinson), and CD138-PE (Miltenyi) to identify CD45-/dim CD38++ CD138+ MM cells and CD45-/dim CD38- CD138+ normal plasma cells.
2. Multicolor flow cytometry sorting: BMNCs were stained with a cocktail of antibodies against CD45, CD38, CD138, CD19, and CD56, conjugated with FITC, PE, PerCP-Cy5.5, PE-Cy7, and APC. These 5 colors allow exclusion of other hematopoietic cells, including mature B cells. Normal plasma cells are CD45-, CD19+, CD138+, CD38-, CD56-, while MM plasma cells are CD45-, CD19-, CD138+, CD38++, CD56+.
3. Two-stage approach: Negative MACS to enrich for B lineage cells and to eliminate other cell types, followed by flow cytometric sorting for plasma cells. BMNCs were incubated with a cocktail of biotin-conjugated monoclonal antibodies against CD2, CD14, CD16, CD36, CD43, and CD235a (Glycophorin A), then passed through the magnetic column. Flow-through cells were enriched for B cells. They were stained with CD38-Cy5 (Becton Dickinson) and CD138-PE (Miltenyi), and sorted.
We have consistently found that for samples with initial infiltration of plasma cells ≥ 10%, “CD138+ MACS” is the quickest and most convenient method, yielding consistently high plasma cell purity, but it has the lowest retrieval rate because some plasma cells were present in the flow-through. This method is not usable for samples with low plasma cell abundance, including all marrow aspirates from normal donors. “Multicolor flow cytometry sorting” gives the best purity for samples with less than 20% plasma cells but takes the longest time to sort and exclude cell populations that were not of interest. Viability of plasma cells in sorting experiments of more than 4-5 hours was poor. The “two-stage approach” was chosen for plasma cell isolation from both myeloma and normal bone marrows because it balances cell purity with processing time and therefore cell viability.
Patient and healthy donor demographics
The study was conducted in compliance with the Declaration of Helsinki and with the approval of the Indiana University School of Medicine Institutional Review Board. Patients with newly diagnosed MM who underwent bone marrow aspiration and biopsy for their routine care consented to donate an additional 5 ml of bone marrow aspirate for the study. Healthy volunteers signed up for the study voluntarily and donated up to 15 ml of bone marrow aspirate. Each patient and each normal donor donated once for the study. There were no biological replicates. Their demographics are listed in Table 1.
Sample ID | Age | Gender | Type | Disease |
---|---|---|---|---|
NL1 | 25 | Male | Control | |
NL2 | 30 | Male | Control | |
NL3 | 40 | Female | Control | |
NL5 | 29 | Male | Control | |
NL6 | 35 | Female | Control | |
NL7 | 28 | Female | Control | |
NL8 | 32 | Male | Control | |
PT1002 | 65 | Male | MM Patient | IgG MM at diagnosis |
PT1004 | 68 | Male | MM Patient | IgA MM at relapse |
PT1009 | 56 | Female | MM Patient | IgG MM at relapse |
PT1011 | 72 | Female | MM Patient | IgG MM at relapse |
PT1012 | 49 | Male | MM Patient | IgG MM at diagnosis |
PT1014 | 57 | Male | MM Patient | IgG MM at diagnosis |
PT1015 | 66 | Female | MM Patient | Light chain MM at diagnosis |
PT1021 | 70 | Male | MM Patient | IgG MM stage at diagnosis |
PT1022 | 62 | Female | MM Patient | IgG MM at relapse |
PT1023 | 56 | Female | MM Patient | Light chain MM at diagnosis |
PT1025 | 63 | Male | MM Patient | Light chain MM at relapse |
PT1028 | 65 | Male | MM Patient | IgG MM at relapse |
PT1029 | 50 | Male | MM Patient | IgG MM at diagnosis |
Table 1: Summary of sample information.
Sample preparation for mass spectrometric analysis
As previously described [29], protein extraction from plasma cells was carried out in lysis buffer containing 8 M urea and 10 mM dithiothreitol (DTT). Bradford assay was performed to determine protein concentrations [30]. Triethylphosphine and iodoethanol were used to reduce and alkylate resulting protein extracts, respectively [31]. Protein mixtures were digested with modified trypsin and filtered through spin filters (0.45 μm) before being applied to the high-performance liquid chromatography (HPLC) system. Stability of the HPLC system and mass spectrometry (MS) instrument was evaluated by spiking a constant amount of chicken lysozyme as an internal reference for quality assurance and quality control (QA/QC) prior to tryptic digestion of protein extracts.
Liquid chromatography-tandem mass spectrometry (LC/MS/MS)
In random order, tryptic peptides (~2 μg) were injected onto an Agilent 1100 nano-HPLC system (Agilent Technologies, Inc., Santa Clara, CA, USA) equipped with a C18 capillary column (i.d. = 75 μm, length = 5 cm, particle size = 3 μm, pore size = 100 Å). Peptides were eluted with a linear gradient from 5 to 45% acetonitrile developed at a flow rate of 500 nL/min over 120 min. Effluent was electro-sprayed into an LTQ mass spectrometer (Thermo-Fisher Scientific, Inc., Waltham, MA, USA). Data collection was performed in the “Triple-Play” mode (MS scan, Zoom scan, and MS/MS scan). Acquired data were filtered and analyzed by a previously published proprietary algorithm developed by Higgs et al. [27,28].
Protein identification, quantification, and statistical analysis
Protein database searches against the International Protein Index (IPI) human database (v3.60) and the NCBI non-redundant Homo sapiens database (updated in January 2017) were carried out using both the SEQUEST (Thermo-Fisher Scientific, Waltham, MA, USA) and X!Tandem (open-source software available from the Global Proteome Machine Organization, http://www.thegpm.org) database search algorithms. Identified proteins were categorized into four priority groups based on the quality of the peptide identification and the number of unique peptides identified [32]. High-confidence proteins were identified with at least one best peptide at a confidence level ≥ 90% (q-value ≤ 0.1, where the q-value represents a false-discovery rate, or FDR, as described previously [33,34]). These proteins were assigned to Priority 1 if two or more unique peptides were identified or to Priority 2 if only a single peptide was identified. Proteins identified with a peptide confidence level of less than 90% but greater than 75% were assigned to Priority 3 (with multiple unique peptides) or Priority 4 (with a single peptide). Peptides with peptide ID confidence <75% were filtered out of this study. The estimation of the confidence levels, which is based on a random forest recursive partition supervised learning algorithm, was described previously [28].
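The priority rules described above can be summarized in a few lines of code. The following Python sketch is illustrative only: the function name `assign_priority` and its arguments are hypothetical, the confidence value is taken as given (the random-forest estimate of [28] is not reproduced), and "multiple" is assumed to mean two or more unique peptides for both Priority 1 and Priority 3, in line with the "Multiple Sequences" column of Table 3.

```python
from typing import Optional


def assign_priority(best_peptide_confidence: float,
                    n_unique_peptides: int) -> Optional[int]:
    """Return the priority group (1-4), or None if the protein is filtered out.

    best_peptide_confidence: confidence of the best peptide ID, as a fraction (0-1).
    n_unique_peptides: number of unique peptides matched to the protein.
    Assumption: a confidence of exactly 0.75 is kept, since only <75% is filtered.
    """
    if n_unique_peptides < 1:
        raise ValueError("a protein needs at least one identified peptide")
    if best_peptide_confidence < 0.75:
        return None  # peptide ID confidence below 75%: excluded from the study
    multiple = n_unique_peptides >= 2  # assumed meaning of "multiple sequences"
    if best_peptide_confidence >= 0.90:
        return 1 if multiple else 2  # high-confidence groups
    return 3 if multiple else 4      # moderate-confidence groups


# Hypothetical examples
print(assign_priority(0.95, 3))  # high confidence, several peptides
print(assign_priority(0.80, 1))  # moderate confidence, single peptide
```

Only Priority 1 proteins were carried forward to pathway analysis in this study, so a filter such as `assign_priority(...) == 1` would reproduce that selection step under the assumptions stated above.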
Protein quantification was carried out using a proprietary protein quantification algorithm licensed from Eli Lilly & Company (Indianapolis, IN, USA) as described previously [27,28]. Briefly, once the raw files were acquired from the mass spectrometer, all extracted ion chromatograms (XICs) were aligned by retention time. To be used in the protein quantification procedure, each aligned peak must match the parent ion, charge state, fragment ions (MS/MS data), and retention time (within a 1-min window). After alignment, the area-under-the-curve (AUC) for each individually aligned peak from each sample was measured, quantile normalized [35], and compared for relative abundance. All peak intensities were transformed to a log2 scale before quantile normalization. Quantile normalization was employed to ensure that every sample has a peptide intensity histogram of the same scale, location, and shape. This normalization removes trends introduced by technical variations including sample handling, sample preparation, total protein differences, and changes in instrument sensitivity while running multiple samples [35]. If multiple peptides have the same protein identification, then their quantile normalized log2 intensities were averaged to obtain log2 protein intensities. The log2 protein intensity is the final quantity that is fit by a separate ANOVA statistical model for each protein:
Log2(Intensity) = Group + Sample(Group)
Sample(Group) is a random effect. Group effect refers to the effect caused by the experimental conditions or treatments being evaluated. Sample effect represents the random effects from individual biological samples. It also includes random effects from sample preparation. All of the injections were randomized, and the same person operated the instrument for all samples in this study. The inverse log2 of each sample’s mean was calculated to determine the fold change between groups.
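The roll-up from peptides to proteins and the back-transformation to a fold change can be sketched as follows. This is a minimal illustration, not the proprietary algorithm of [27,28]: the function names are hypothetical, the group means are plain averages rather than estimates from the fitted ANOVA model, and the signed convention (a ratio below 1 reported as a negative reciprocal) is an assumption inferred from the negative fold-changes listed for down-regulated proteins.

```python
import numpy as np


def protein_log2_intensity(peptide_log2: np.ndarray) -> np.ndarray:
    """Average normalized log2 peptide intensities into one value per sample.

    peptide_log2: array with rows = peptides of one protein, columns = samples.
    """
    return peptide_log2.mean(axis=0)


def signed_fold_change(log2_case, log2_control) -> float:
    """Fold change between two groups from their mean log2 protein intensities.

    The difference of the group means is back-transformed with an inverse log2.
    Assumed convention: ratios below 1 are reported as -1/ratio.
    """
    diff = float(np.mean(log2_case) - np.mean(log2_control))
    ratio = 2.0 ** diff
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical protein with two peptides measured in four samples
# (columns 0-1: controls, columns 2-3: MM patients).
peptides = np.array([[10.0, 10.0, 11.0, 11.0],
                     [12.0, 12.0, 13.0, 13.0]])
protein = protein_log2_intensity(peptides)
fc = signed_fold_change(protein[2:], protein[:2])
print(protein, fc)
```

Working with differences on the log2 scale keeps up- and down-regulation symmetric; the sign convention is applied only at the reporting step.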
Pathway analysis
All priority 1 proteins with significant differential expression (q<0.05) were considered for further characterization by pathway analysis. Identified proteins were classified according to biological function(s) by Ingenuity Pathway Analysis software (https://analysis.ingenuity.com). Statistical analyses were performed using JMP software (SAS Institute, Inc., Cary, NC). A p-value <0.05 was considered significant.
Multiple-Reaction-Monitoring (MRM) Development for target validation
To validate some of these differentially expressed proteins, a mass spectrometry-based MRM assay was developed. A different patient cohort was used for this validation study. All MRM mass spectrometric analyses were performed on an AB SCIEX 4000 Qtrap hybrid triple-quadrupole linear ion-trap mass spectrometer (AB SCIEX, Framingham, MA, USA) interfaced with a Dionex UltiMate 3000 UHPLC system (Thermo-Fisher Scientific). Liquid chromatography (LC) was performed on a TSK-GEL™ ODS-100V C18 column (Tosoh Bioscience, Tokyo, Japan, 1 mm i.d. × 50 mm, 3 μm particle size). Peptides were eluted with a linear gradient from 8 to 25% acetonitrile developed over 50 min at a flow rate of 60 μL/min, and effluent was electro-sprayed into the 4000 Qtrap mass spectrometer. The source lenses were set by maximizing the ion current for the [M+2H]2+ charge state of angiotensin. Chromatographic data acquisition was carried out using Analyst 1.5 (AB SCIEX), and peak integration and quantification were carried out using Skyline 1.2.1 (created by the MacCoss Lab, University of Washington). The selected proteins and their corresponding MRM peptides are shown in Table 2. Stable-isotope labeled internal standards of each selected MRM peptide were spiked in before digestion for quantification purposes. We also monitored three transitions for a spiked external standard (‘GYSLGNWVCAAK’ of chicken lysozyme) for QA/QC purposes: m/z 656.82 ([M+2H]2+) → m/z 436.22, m/z 656.82 ([M+2H]2+) → m/z 892.43, and m/z 656.82 ([M+2H]2+) → m/z 1092.55.
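The lysozyme QA/QC transitions quoted above can be checked from first principles. The sketch below computes monoisotopic precursor and singly charged y-ion m/z values for a peptide; the function names are hypothetical, and it assumes that cysteine carries a hydroxyethyl group (+44.026 Da) from the iodoethanol alkylation described under sample preparation. Under that assumption the results agree with the reported values to two decimal places, with the fragments corresponding to y4, y8, and y10.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
HYDROXYETHYL = 44.026215  # assumption: Cys alkylated with iodoethanol (C2H4O)


def residue_mass(aa: str) -> float:
    """Residue mass including the assumed fixed modification on cysteine."""
    return MONO[aa] + (HYDROXYETHYL if aa == "C" else 0.0)


def precursor_mz(peptide: str, z: int) -> float:
    """m/z of the [M+zH]z+ precursor ion."""
    neutral = sum(residue_mass(aa) for aa in peptide) + WATER
    return (neutral + z * PROTON) / z


def y_ion_mz(peptide: str, n: int, z: int = 1) -> float:
    """m/z of the y-ion made of the n C-terminal residues."""
    fragment = sum(residue_mass(aa) for aa in peptide[-n:]) + WATER
    return (fragment + z * PROTON) / z


lysozyme = "GYSLGNWVCAAK"
print(round(precursor_mz(lysozyme, 2), 2))
for n in (4, 8, 10):
    print(f"y{n}+", round(y_ion_mz(lysozyme, n), 2))
```

With a different cysteine modification (for example carbamidomethylation from iodoacetamide) the precursor would shift, so the agreement with m/z 656.82 is a useful consistency check on the alkylation chemistry.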
Protein | Annotation | MRM Peptide | z | Avg. Mass | MRM Transition 1 (ion, m/z) | MRM Transition 2 (ion, m/z) | MRM Transition 3 (ion, m/z) | MRM Transition 4 (ion, m/z) |
---|---|---|---|---|---|---|---|---|
Experimentally observed | | | | | | | | |
IPI00012048.1 | Isoform 1 of nucleoside diphosphate kinase A | NIIHGSDSVESAEK | 2 | 743.79 | y11++, 573.26 | y10+, 1008.45 | | |
IPI00216691.5 | Profilin-1 | CYEMASHLR | 2 | 577.66 | y7+, 843.41 | y6+, 714.37 | | |
IPI00017855.1 | Aconitate hydratase, mitochondrial | LNRPLTLSEK | 2 | 586.20 | b9+, 1024.58 | b7+, 808.50 | | |
IPI00021266.1 | 60S ribosomal protein L23a | LAPDYDALDVANK | 2 | 703.28 | y11++, 610.79 | | | |
IPI00006935.3 | Eukaryotic translation initiation factor 5A-2 | KYEDICPSTHNMDVPNIK | 3 | 717.14 | y17++, 1010.46 | y5+, 570.36 | | |
IPI00298308.7 | Isoform 1 of probable 10-formyltetrahydrofolate dehydrogenase ALDH1L2 | ANSTEYGLASGVFTR | 2 | 787.36 | y9+ / b9+, 907.50 / 907.42 | b6+ / y6+, 666.27 / 666.36 | y7+, 737.39 | y5+, 579.33 |
IPI00297084.7 | Dolichyl-diphosphooligosaccharide-protein glycosyltransferase 48 kDa subunit | TLVLLDNLNVR | 2 | 635.76 | y6+, 730.38 | y8+, 956.55 | y3+, 388.23 | |
IPI00220301.5 | Peroxiredoxin-6 | LPFPIIDDR | 2 | 543.64 | y6+, 728.39 | y2+, 290.15 | | |
IPI00019502.3 | Isoform 1 of Myosin-9 | DFSALESQLQDTQELLQEENR | 2 | 1247.82 | y5+, 675.31 | y6+, 788.39 | b16+, 1818.88 | |
Table 2: Selected target proteins, MRM peptides and transitions used for the MRM assay development.
Plasma cell isolation from MM and Normal volunteer bone marrows
We found that the purity of isolated plasma cells depends largely on the abundance of plasma cells in the samples. For samples with higher than 20% of plasma cells, the positive MACS yields >95% plasma cells consistently, while taking the least processing time. However, when plasma cells in the bone marrow aspirates are less than 20%, purity of the positive MACS decreased dramatically and there was a significant plasma cell loss in the flow-through. This was a primary problem for normal bone marrow which usually contains less than 1% plasma cells.
Unbiased large-scale global proteomic study
To characterize the alterations in protein expression related to multiple myeloma phenotypes, we performed a label-free LC/MS-based quantitative proteomic analysis of the sorted plasma cells from healthy controls and MM patients. The sample information in each group is summarized in Table 1. Proteins identified based on priority groups [27] are summarized in Table 3. A total of 776 proteins were identified and quantified with high confidence (Priority groups 1 & 2) in the samples. The expression levels of 18 proteins from Priority Group 1 and 90 proteins from Priority Group 2 were statistically significantly changed. Among 18 significantly changed proteins from the Priority Group 1, 9 were up-regulated (Table 4) and 9 were down-regulated (Table 5). These 18 proteins were further analyzed by pathway analysis for their roles in biological processes. The overall results from the study are illustrated in Figure 1. When all four priority groups are considered, there are more down-regulated proteins than up-regulated ones.
Protein Priority | Peptide ID Confidence | Multiple Sequences | Number of Proteins Identified | Number of Significant Changes | Maximum Absolute Fold-change | Median %CV |
---|---|---|---|---|---|---|
1 | High (>90%) | Yes | 372 | 18 | 2.43 | 20.35 |
2 | High (>90%) | No | 404 | 90 | 16.54 | 38.39 |
3 | Moderate (75-90%) | Yes | 20 | 3 | 5.23 | 33.26 |
4 | Moderate (75-90%) | No | 520 | 58 | 10.67 | 47.03 |
Overall | | | 1316 | 169 | 16.54 | 34.35 |
Table 3: Overall summary of the study.
Protein ID (IPI) | Annotation | Fold-change |
---|---|---|
IPI00017855.1 | Aconitate hydratase, mitochondrial | 1.49 |
IPI00909879.1 | cDNA FLJ50886, highly similar to aconitate hydratase, mitochondrial | 1.58 |
IPI00012048.1 | Isoform 1 of nucleoside diphosphate kinase A | 1.78 |
IPI00026260.1 | Isoform 1 of nucleoside diphosphate kinase B | 1.61 |
IPI00419373.1 | Isoform 1 of heterogeneous nuclear ribonucleoprotein A3 | 1.64 |
IPI00216691.5 | Profilin-1 | 1.65 |
IPI00452747.6 | Similar to Signal peptidase complex subunit 2 | 1.69 |
IPI00021266.1 | 60S ribosomal protein L23a | 2.02 |
IPI00006935.3 | Eukaryotic translation initiation factor 5A-2 | 2.11 |
Table 4: Up-regulated proteins in priority 1 group with FDR<5%.
Protein ID (IPI) | Annotation | Fold-change |
---|---|---|
IPI00298308.7 | Aldehyde dehydrogenase 1 family member L2 (ALDH1L2) | -2.01 |
IPI00219291.5 | Isoform 2 of ATP synthase subunit F, mitochondrial | -1.99 |
IPI00013895.1 | Protein S100-A11 | -1.79 |
IPI00171903.2 | Isoform 1 of Heterogeneous nuclear ribonucleoprotein M | -1.74 |
IPI00010471.5 | Plastin-2 | -1.74 |
IPI00007188.5 | ADP/ATP translocase 2 | -1.70 |
IPI00297084.7 | Dolichyl-diphosphooligosaccharide - protein glycosyltransferase 48 kDa subunit | -1.62 |
IPI00220301.5 | Peroxiredoxin-6 | -1.46 |
IPI00019502.3 | Isoform 1 of myosin-9 | -1.44 |
Table 5: Down-regulated proteins in priority 1 Group with FDR<5%.
For quality assurance and quality control (QA/QC) purposes, chicken lysozyme was spiked into every individual sample at a constant amount (5 ng chicken lysozyme per 2 μg of test sample) before tryptic digestion. Nine unique chicken lysozyme peptides were detected and quantified. After averaging these peptide concentration values, a 1.082 fold-change was observed between groups with a q-value (FDR) of 0.77, indicating that this small apparent change (8.2%) is not statistically significant and supporting the reliability of the data obtained in this study.
MRM assays
To confirm some of the observed protein expression changes from the global biomarker discovery experiment, a multiple-reaction-monitoring (MRM)-based targeted proteomic assay was developed and a selected panel of targets was quantitatively validated by MRM. Table 6 shows the results from the targeted MRM assays.
Protein | Annotation | Observed Fold-Change (FC) |
---|---|---|
IPI00012048.1 | Isoform 1 of nucleoside diphosphate kinase A | 1.91 |
IPI00216691.5 | Profilin-1 | 1.75 |
IPI00017855.1 | Aconitate hydratase, mitochondrial | 1.61 |
IPI00021266.1 | 60S Ribosomal protein L23a | 2.32 |
IPI00006935.3 | Eukaryotic translation initiation factor 5A-2 | 2.12 |
IPI00298308.7 | Isoform 1 of probable 10-formyltetrahydrofolate dehydrogenase (ALDH1L2) | -2.17 |
IPI00297084.7 | Dolichyl-diphosphooligosaccharide – protein glycosyltransferase 48 kDa subunit | -1.62 |
IPI00220301.5 | Peroxiredoxin-6 | -1.81 |
IPI00019502.3 | Isoform 1 of myosin-9 | -1.57 |
Table 6: Target proteins detected by MRM from the clinical samples (“-” indicates down-regulation).
Statistical motivation
The size of the treatment or disease effect (signal) needs to be evaluated relative to the sample and replicate variation (noise). The signal-to-noise ratio is estimated based on a statistical model. If the data have multiple sources of random variation, such as biological samples and replicates, then the data are modeled with a linear mixed model (a generalization of ANOVA, analysis of variance) [36]. This kind of model, especially when applied to complex experimental designs, cannot be handled by introductory methods such as t-tests. The exact scale of the protein expression used in the model can make a difference in the sensitivity. There is usually a large technical variation introduced by the act of ‘measurement’ in any ‘omics’ study. Randomization of measurement order will eliminate the bias, but it is still extremely important to ‘normalize’, or mathematically calibrate, the measurement. This is a highly technical matter but can be viewed as similar to mathematically resetting a scale to zero before each measurement. We use a statistically based method called ‘quantile normalization’ [35], which was the result of considerable research on genomic data. Because ‘omics’ measures of expression are usually on an arbitrary scale, it is best to evaluate ratios or their equivalent differences on the log scale. Log base 2 is chosen because a unit difference on the log scale is equivalent to a two-fold change.
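To make the idea of quantile normalization concrete, the sketch below applies a textbook version of the procedure to a small, made-up matrix of log2 peptide intensities. It is an illustration only: the function name and data are hypothetical, missing values and ties are not handled, and it is not claimed to match the implementation used in the quantification software.

```python
import numpy as np


def quantile_normalize(log2_matrix) -> np.ndarray:
    """Quantile-normalize columns (samples) of a peptides x samples matrix.

    Each value is replaced by the mean, across samples, of the values holding
    the same rank, so every sample ends up with an identical distribution.
    Assumes no missing values; tied values are not averaged.
    """
    x = np.asarray(log2_matrix, dtype=float)
    order = np.argsort(x, axis=0)                    # rank order within each sample
    sorted_x = np.take_along_axis(x, order, axis=0)  # each column sorted ascending
    reference = sorted_x.mean(axis=1)                # mean intensity at each rank
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


# Hypothetical raw peak areas: 4 peptides (rows) x 3 samples (columns).
raw = np.array([[1000.0, 1500.0, 900.0],
                [4000.0, 5200.0, 3800.0],
                [250.0, 400.0, 200.0],
                [16000.0, 21000.0, 12000.0]])
log2_raw = np.log2(raw)            # log2 transform before normalization
norm = quantile_normalize(log2_raw)
print(np.round(norm, 3))
```

After normalization, a difference of 1 between two log2 values still corresponds to a two-fold change, but sample-wide shifts in intensity, such as those caused by loading or instrument sensitivity, have been removed.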
Up-regulated biomarker candidate proteins
The nine up-regulated MM biomarker candidate proteins were identified with high confidence. All of these proteins have been reported to play some role in cancer, which lends support to their potential as biomarkers of MM, especially as prognostic biomarkers, e.g., of treatment response.
The first of these nine proteins is mitochondrial aconitate hydratase, a mitochondrial TCA-cycle enzyme that catalyzes the reversible isomerization of citrate to isocitrate [37]. It is very sensitive to reactive oxygen species (ROS) [38] and plays a key role in the malignant transformation of the prostate [39]. Blocking its mRNA expression causes a decrease in ATP biosynthesis, an increase in citrate secretion, and a reduction in the proliferation rate of human prostate carcinoma cells [40]. Thus it is reasonable to assume that overexpression of mitochondrial aconitate hydratase increases cell proliferation. Evidence also supports a broad role for the p53 gene in regulating its expression and prostatic tumorigenesis [41].
The second protein is an uncharacterized protein cDNA FLJ50886, which is highly similar to mitochondrial aconitate hydratase. However, its function remains to be discovered.
The third and fourth proteins are isoform 1 of nucleoside diphosphate kinase (NDK) A and B. NDK exists as a hexamer composed of 'A' (encoded by NME1) and 'B' (encoded by NME2) isoforms. Multiple alternatively spliced transcript variants encoding the same isoform have been found for this gene. The NME family of genes encodes highly conserved (~78% amino acid identity) multifunctional proteins that have been shown to participate in nucleic acid metabolism, energy homeostasis, cell signaling, and cancer progression [42,43]. Some family members, particularly isoforms 1 and 2, are the most closely related and are the ones most implicated in tumor progression [42]. They are also evolutionarily highly conserved.
Indeed, the Drosophila AWD lethal phenotype can be rescued by exogenously expressed human NME2 [44]. Unfortunately, there have been few consensus mechanistic explanations for this critical function because of the numerous molecular functions ascribed to these proteins, including nucleoside diphosphate kinase, protein kinase, nuclease, transcription factor, growth factor, among others [45]. At present, it is not yet clear what molecular activity of NME is involved in multiple myeloma and such information will be of ultimate importance for the NME studies in general.
The fifth protein, isoform 1 of heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3), is less well characterized than the best-known members of the hnRNP family (i.e., A1, A2/B1). The hnRNP proteins have a major nucleoplasmic localization, with several of them (A, D, E, I, K, L) capable of nucleo-cytoplasmic shuttling, while others (C and U) do not exit the nucleus except under certain cellular conditions [46,47]. In addition to the major nuclear role of hnRNPs in mRNA processing (splicing, polyadenylation) and transport, they are known to participate in several other events, including transcription, DNA repair, and telomere DNA formation [48,49]. hnRNP A3 was also suggested by Ma et al. [50] and Papadopoulou et al. [51] to be involved in many aspects of mRNA maturation, including hnRNP/mRNA interactions. However, the role of the overexpression of hnRNP A3 in MM plasma cells is still unclear.
The sixth protein, profilin-1, coded by the PFN1 gene, is a ubiquitous actin-binding protein regulating actin polymerization in response to extracellular signals. Deletion of the PFN1 gene is associated with Miller-Dieker syndrome [52]. It also plays a role in Huntington disease [53]. Recently published proteomics data have demonstrated that profilin-1 could be used as a biomarker for breast cancer prognosis [54,55], although further validation is required before it can be used to predict treatment response to tamoxifen in breast cancer patients. It was described three decades ago that profilin-1 is involved in the negative regulation of carcinoma cell motility [56]. It is also associated with other cancers [57].
The seventh protein, similar to signal peptidase complex subunit 2 (SPCS2), is a serine protease that cleaves signal peptides from translocated precursor proteins [58]. However, its role in cancer or other diseases remains to be uncovered.
The eighth protein, 60S ribosomal protein L23a (RPL23A), is a component of the large 60S ribosomal subunit, which together with the small 40S subunit forms the ribosome complexes that catalyze protein synthesis [59]. In humans, this protein belongs to the L23P family of ribosomal proteins and may be one of the targets involved in mediating growth inhibition by interferon [60]. It was recently shown by Sun et al. that RPL23A exhibited anti-cancer activity in Hep-2 cells [61]. More research is required to determine the anticancer activities of this protein.
The final protein in this group, eukaryotic translation initiation factor 5A-2 (eIF5A-2), is a well characterized protein involved in the regulation of cell proliferation and apoptosis [62]. Overexpression of eIF5A-2 promotes colorectal carcinoma cell aggressiveness by up-regulating MTA1 through c-Myc to induce epithelial mesenchymal transition [63] and enhances cell motility and metastasis [64]. eIF5A-2 is an adverse prognostic marker of survival in stage I non-small cell lung cancer patients [65]. It was also recently suggested that eIF5A-2 may serve as a new molecular diagnostic or prognostic marker or as a molecular target for anti-cancer therapy [66,67].
Down-regulated priority 1 proteins
The nine significantly down-regulated proteins in Priority Group 1 are aldehyde dehydrogenase 1 family member L2 (ALDH1L2), isoform 2 of mitochondrial ATP synthase subunit F, protein S100-A11, isoform 1 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), plastin-2, ADP/ATP translocase 2, dolichyl-diphosphooligosaccharide-protein glycosyltransferase 48 kDa subunit, peroxiredoxin-6, and isoform 1 of myosin-9 (Table 4). Among these proteins, protein S100-A11, hnRNP M, peroxiredoxin-6, and isoform 1 of myosin-9 are of particular interest because they have been implicated in cancer progression.
Protein S100-A11, also called calgizzarin, was found to be differentially expressed in UV-treated HeLa cells [68], in human head-and-neck squamous cell carcinomas (HNSCCs) [69], and in colorectal carcinoma [70]. Although the precise role of S100-A11 in carcinogenesis is poorly understood, it seems that the formation of homo- and hetero-dimers, binding of Ca2+, and interaction with effector molecules are essential for the development and progression of many cancers [71,72]. Several studies have suggested that S100 proteins promote cancer progression and metastasis through cell survival and apoptosis pathways [73-75].
hnRNP M plays a role in mediating metastasis and the inflammatory response [76,77], while under-expression of peroxiredoxin-6 enhances the susceptibility of cells to tumorigenesis [78]. Isoform 1 of myosin-9 is known to bind a number of proteins related to cancer progression and the unconventional secretory pathway [79].
Study limitations
This work demonstrates the use of large-scale quantitative mass spectrometry for rapid identification of biomarker candidates, which could have important clinical value once further validated. More detailed studies, especially with larger sample sizes, will provide sufficient power to discern the importance of these biomarkers. We also highlight the technical difficulties of studying cellular biomarkers, particularly in low-abundance plasma cells, and our experience underscores the need for method standardization. Our methodology could benefit the analysis of bone marrow samples with low plasma cell abundance, such as samples collected after therapy, to determine important markers of disease response or relapse.
A technical limitation of our study is the age difference between the healthy controls and the patients. Age-matched healthy controls are difficult to find because myeloma is a disease of the elderly. One way to obtain an age-matched normal population may be to isolate normal plasma cells from our patients who are in remission after stem cell transplantation, but exposure to treatment and possible contamination by a small number of remaining myeloma cells are confounding factors. Alternatively, normal plasma cells can be isolated from the bone marrow of patients undergoing bone marrow aspiration for reasons unrelated to myeloma, such as anemia. However, the underlying health conditions that mandate the procedure are also confounding factors. Moreover, the additional bone marrow aspirates required for research may not be safe to obtain from patients with significant anemia. Ultimately, the best validation strategy may be to directly evaluate these proteins in a larger, independent patient cohort.
We cannot exclude an impact of aging on the expression of these proteins. A literature search did not identify age-related alterations of these proteins, either in general or within the lymphocyte system. Again, validation in a larger patient cohort may allow patients to be stratified into different age groups. Differential expression between younger and older patients (i.e., younger than 65 years versus 65 years and older) may give a clue to the effect of age on protein expression. Mechanistic studies of these proteins in myeloma biology will require genetic knockdown models or specific inhibitors where applicable.
With increased interest in biomarker research, the data and method reported in this unbiased global proteomic study may serve as a good example for clinical biomarker discovery. Using a panel of candidate biomarkers will most likely enhance the selectivity and specificity of a diagnostic and/or prognostic assay. The time has come to explore protein biomarkers as tools to further enhance our understanding of multiple myeloma and potentially improve patient care.
The authors acknowledge the Indiana University School of Medicine Proteomics Core for access to the mass spectrometers and the Linux cluster. The work was funded by the Multiple Myeloma Research Foundation (MMRF). Attaya Suvannasankha and Mu Wang were recipients of an MMRF Collaborative Proteomics Grant Award. In addition, a VA merit award partly covers the effort of Attaya Suvannasankha and Colin D. Crean.