ISSN: 0974-276X
Research Article - (2008) Volume 1, Issue 7
Protein splicing is a post-translational process in which a nested intervening sequence (intein) is spliced out of the interior of a polypeptide precursor, and the flanking protein fragments (exteins) are ligated to form a mature protein. This process was identified in yeast, bacteria and the plant jackbean, and recently for MHC class I antigen processing in vertebrates. Thus, it seems very likely that, besides antigens, functional proteins could be synthesized by post-translational splicing in vertebrates. Protein splicing indicates that proteins, after their translation, can evolve and change/exchange their sequences. The availability of natural mechanisms of protein splicing leads to the assumption that such and/or similar mechanisms might exist, enabling in vivo polypeptide rearrangement on a large scale and in a diverse manner. Thus, I propose here a polypeptide rearrangement hypothesis that describes the generation of new proteins (mosaic proteins) through exchange or reorganization of defined polypeptide sequences (modules) between or within translated proteins. The implications of this hypothesis for genetic diversity, protein antigenic properties and diseases are discussed.
Keywords: Protein splicing; Intein; Extein; Protein Rearrangement; Genetic diversity; Expression profile; Mosaic proteins; Post-translational modification; MHC; Autoantigen; Neo-autoantigen
The major histocompatibility complex (MHC) was originally discovered because of its role in the rejection of transplants made between incompatible individuals (Klein, 1986). MHC class I molecules (MHC-I), encoded by genes in the MHC locus, are constitutively expressed on almost all nucleated cells and present peptides, usually eight or nine amino acids in length, to CD8-expressing cytotoxic T lymphocytes (CTLs). Most peptides that bind to class I molecules are derived from proteins synthesized in the cytosol and degraded by the proteasome. The recognition of non-self or "anomalous" antigenic peptides leads to the activation of the CTL, which lyses the antigen-presenting cell (Flutter and Gao, 2004). However, studies of the MHC and of antigen processing and presentation have not only revealed the principles underlying self/non-self discrimination by the immune system; they have also advanced our knowledge of several genetic and cellular processes, such as gene polymorphism, regulation of gene expression, gene evolution, protein trafficking and transport, post-translational modification of proteins, and the function of chaperones and proteasome/proteases. Furthermore, the MHC, and particularly the diversity of MHC genes, has had a stimulating effect on biologists, leading to the generation of new hypotheses (see for example Klein, 1987; Müllbacher, 1997; Martinsohn et al., 1999). Thus, studies of the MHC have improved our biological knowledge and provided a source of inspiration and imagination. For these reasons, the former designation of the MHC as a supergene by George Snell (Snell, 1981) seems very appropriate.
Recent studies on this "supergene" demonstrate that not all antigens presented by MHC-I molecules represent a continuous peptide sequence of a degraded protein (Hanada et al., 2004; Vigneron et al., 2004). Two different peptide sequences can be joined to generate a suitable MHC-I antigen, analogous to the joining of exons by mRNA splicing. Furthermore, Van den Eynde and his colleagues showed that the non-contiguous peptides can be spliced together in the reverse order (Warren et al., 2006). Once more, a fascinating new finding about the MHC compels us to reconsider our current ideas about fundamental genetic processes and prompts a new hypothesis.
Protein Splicing
Protein splicing is a post-translational process, in which a nested intervening sequence (intein) is spliced out of the interior of a polypeptide precursor, and the flanking protein fragments (exteins) are ligated to form a mature protein. This process was identified in yeast, bacteria and the plant jackbean (Wallage, 1993), and recently for antigen processing in vertebrates (Hanada et al., 2004; Vigneron et al., 2004; Hanada and Yang, 2005; Warren et al., 2006). Thus, it seems very likely that, besides antigens, functional proteins could be synthesized by post-translational splicing in vertebrates.
So far, most of the described cases of naturally occurring protein splicing are cis-splicing, where the excised intein and the ligated exteins belong to the same polypeptide chain, which is encoded by a single gene.
Interestingly, recent work on prokaryotic proteins (such as in cyanobacteria and the archaeon Nanoarchaeum equitans) showed that proteins can also be generated by trans-splicing of two polypeptides encoded by two different genes (Wu et al., 1998; Evans et al., 2000; Liu and Yang, 2003; Choi et al., 2006; Dassa et al., 2007). The products of the two genes consist of an N-extein (N-terminal extein) followed by an intein sequence, and an intein sequence followed by a C-extein (C-terminal extein), respectively. These so-called split inteins interact to form a functional intein and catalyze protein splicing in trans. Thus, in this process, complementation of the intein fragments precedes the splicing reaction.
The mechanism of protein splicing typically consists of four steps (see Figure 1):
Figure 1: A general autocatalytic model of protein splicing (modified according to Saleh and Perler, 2006; Hall et al., 1997; Shao et al., 1996). The first step involves a nucleophilic attack by the hydroxyl group of (here) a serine (at the amino end of the intein) on the carbonyl group of the peptide bond of the preceding residue (at the carboxyl end of the N-extein), resulting in an N-O acyl shift. Second step: the ester linkage thus formed is then broken down via a nucleophilic attack by the hydroxyl group of a serine (the first residue of the C-extein). This leads to a transesterification and formation of a branched protein intermediate. The nucleophilic attacks, which can also be initiated by threonine or cysteine, are facilitated by a preceding deprotonation of the hydroxyl/thiol group by e.g. a base (B) or simply water. The branched protein intermediate resolves to a free intein and ligated exteins by cyclization of the C-terminal residue (usually an asparagine) of the intein into a succinimide (step 3). Finally, the ester bond linking the exteins is spontaneously shifted into a peptide bond (O-N acyl shift) and the succinimide ring at the intein carboxy end is slowly hydrolyzed to regenerate asparagine or isoasparagine. N-extein: N-terminal extein. C-extein: C-terminal extein.
The first step is a reversible shift of the peptide bond between the amino end of the intein and its amino-terminal flank (N-extein) into an ester or thioester bond. This N-O or N-S acyl shift results from a nucleophilic attack on the bond by the side chain (-OH or -SH) of the serine, threonine or cysteine residue at the amino-terminal end of the intein.
In the second step the ester (or thioester) bond is nucleophilically attacked by the side chain (-OH or -SH) of the first residue in the C-extein, which is usually serine, threonine or cysteine. This leads to a transesterification and formation of a branched intermediate with two amino ends, one of the N-extein and one of the intein. The intein is joined by a peptide bond to the C-extein, and the two exteins are joined by an ester or thioester bond.
In the third step the intein is cleaved away by cyclization of its C-terminal residue (usually an asparagine or glutamine) into a succinimide or glutarimide ring resulting in the excision of the intein and the release of the exteins linked by an ester or thioester bond.
The final step consists of spontaneous shift of the ester or thioester bond linking the exteins into a peptide bond (S/N or O/N acyl shift). The succinimide or glutarimide ring at the intein carboxy end is slowly hydrolyzed to regenerate asparagine or isoasparagine.
More details about the mechanism of protein splicing are described e.g. in Saleh and Perler, 2006; Hall et al., 1997; Shao et al., 1996.
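The four steps above can be mirrored at the sequence level by a minimal sketch. It does not model any chemistry: it only checks the residue requirements named in the text (Ser, Thr or Cys at the amino end of the intein and as the first residue of the C-extein; Asn or Gln at the carboxy end of the intein) and then returns the ligated exteins and the free intein. The text says these residues are "usually" present, so treating them as strict requirements is a simplification. The function name `splice_cis`, the one-letter sequences and the 0-based, end-exclusive coordinates are assumptions made for illustration.

```python
# Toy sequence-level model of cis-splicing (illustrative assumption, not a
# chemical simulation). Residues are one-letter amino acid codes.
NUCLEOPHILES = {"S", "T", "C"}  # serine, threonine, cysteine
CYCLIZING = {"N", "Q"}          # asparagine, glutamine


def splice_cis(precursor: str, intein_start: int, intein_end: int):
    """Return (mature_protein, free_intein) for a precursor polypeptide.

    intein_start and intein_end are 0-based, end-exclusive indices
    (a convention chosen for this sketch).
    """
    n_extein = precursor[:intein_start]
    intein = precursor[intein_start:intein_end]
    c_extein = precursor[intein_end:]
    if not (n_extein and intein and c_extein):
        raise ValueError("precursor must contain N-extein, intein and C-extein")

    # Step 1: the N-O / N-S acyl shift needs Ser/Thr/Cys at the intein amino end.
    if intein[0] not in NUCLEOPHILES:
        raise ValueError("first intein residue must be S, T or C")
    # Step 2: transesterification needs Ser/Thr/Cys as first C-extein residue.
    if c_extein[0] not in NUCLEOPHILES:
        raise ValueError("first C-extein residue must be S, T or C")
    # Step 3: intein release by cyclization of its C-terminal Asn/Gln.
    if intein[-1] not in CYCLIZING:
        raise ValueError("last intein residue must be N or Q")
    # Step 4: the acyl shift leaves the exteins joined by a peptide bond.
    return n_extein + c_extein, intein


mature, intein = splice_cis("MKLAG" + "CAAAN" + "SGHV", 5, 10)
print(mature, intein)
```

A trans-splicing variant could be sketched the same way by passing the two split-intein polypeptides separately and concatenating their intein halves before the checks.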
In lower organisms, the intein is known to be a self-excising catalytic unit, whereas antigen splicing in vertebrates is catalysed by the proteasome. However, in both cases, no exogenous cofactors or energy sources are necessary. The energy released during peptide-bond hydrolysis is used to make a new peptide bond.
The polypeptide rearrangement hypothesis
Protein splicing (and in particular trans-splicing) indicates that proteins, once translated, do not always represent a self-contained unit. Rather, they can evolve and change/exchange their sequences. The availability of the natural mechanism of protein splicing described above leads to the assumption that such and/or similar mechanisms might exist, enabling in vivo polypeptide rearrangement on a large scale and in a diverse manner. Thus, I propose here a polypeptide rearrangement hypothesis that describes the generation of new proteins/mosaic proteins through exchange or reorganization of defined polypeptide sequences, which are designated here as modules, between or within translated proteins. At least two scenarios for polypeptide recombination can be proposed (Figure 2): (i) after excision of an internal module (In-module) sequence from a protein, the released N-terminal and C-terminal modules and/or internal module sequences can be re-ligated, e.g. in an inverse order, to build a new protein (see Figure 2a); (ii) the modules can be mutually exchanged between two polypeptides (B and C), resulting in chimeric proteins (see Figure 2b).
Figure 2: Protein rearrangement hypothesis. At least two scenarios for polypeptide recombination can be proposed: (a) Excision of an In-module and religation of the modules (Nt-, In- and Ct-modules) in altered/reverse order. (b) Upon excision of In-modules from two different proteins (B and C), the different modules can be mutually exchanged between the proteins (B and C), resulting in the release of chimeric proteins. Nt-module: N-terminal module; Ct-module: C-terminal module; In-module: internal module.
In the proposed rearrangement model, "inteins" represent functional fragments that can be integrated into newly formed mosaic proteins; therefore I prefer to use the term "module", meaning interchangeable protein fragments that can be assembled to build a new functional protein.
During the recombination process, the protein modules act like transposons that jump from one polypeptide sequence or from one position to the other. This reorganisation would result in the inactivation or modulation of the function of the targeted protein.
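The two scenarios of Figure 2 can be written down as simple string operations, which makes explicit what a "mosaic" sequence would look like relative to its precursors. The sketch below is purely illustrative: the names `Modules`, `split_modules`, `religate` and `exchange_internal` are hypothetical, the module boundaries are supplied by hand, and scenario (b) is assumed to swap only the In-modules, which is one of several exchanges the hypothesis would allow.

```python
from typing import NamedTuple, Tuple


class Modules(NamedTuple):
    nt: str        # Nt-module (N-terminal module)
    internal: str  # In-module (internal module)
    ct: str        # Ct-module (C-terminal module)


def split_modules(protein: str, in_start: int, in_end: int) -> Modules:
    """Cut a protein into Nt-, In- and Ct-modules (0-based, end-exclusive)."""
    return Modules(protein[:in_start], protein[in_start:in_end], protein[in_end:])


def religate(modules: Modules, order=("ct", "internal", "nt")) -> str:
    """Scenario (a): religate the modules of one protein in a new order.

    The default order is the fully inverted module order.
    """
    return "".join(getattr(modules, name) for name in order)


def exchange_internal(b: Modules, c: Modules) -> Tuple[str, str]:
    """Scenario (b): swap the In-modules of two proteins B and C."""
    chimera_b = b.nt + c.internal + b.ct
    chimera_c = c.nt + b.internal + c.ct
    return chimera_b, chimera_c


protein_b = split_modules("AAABBBCCC", 3, 6)
protein_c = split_modules("DDDEEEFFF", 3, 6)
print(religate(protein_b))                      # modules of B in inverted order
print(exchange_internal(protein_b, protein_c))  # two chimeric sequences
```

Note that "reverse order" here refers to the order of the modules; the residues within each module keep their original N-to-C direction.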
Polypeptide rearrangements are chemically and mechanistically feasible
Polypeptide rearrangement events involve the breakage of peptide bonds, the excision of fragments, and the ligation of the generated protein fragments. The initial chemical mechanism leading to the breakage of a peptide bond is a nucleophilic attack by e.g. a serine, threonine or cysteine residue on the adjacent peptide bond, leading to an N-O or N-S acyl shift that yields a reactive ester or thioester bond, respectively (see Figure 1). This mechanism is used by various groups of intramolecular autoprocessing proteins. These include autocleavage of Hedgehog proteins, protein splicing, maturation of pyruvoyl-dependent enzymes, and glycosylasparaginase precursors (reviewed in Paulus, 2000).
Furthermore, the chemical mechanisms enabling the subsequent excision and ligation of protein fragments, which involve the transesterification and cyclization followed by S/O-N acyl rearrangements (see Figure 1), are also a well established fact, as it is shown e.g. for protein splicing (e.g. Wallage, 1993; Wu et al., 1998; Evans et al., 2000; Liu and Yang, 2003; Choi et al., 2006; Dassa et al., 2007) and for the covalent attachment of a cholesterol moiety to the aminoterminal fragment of Hedgehog protein (Hall et al., 1997). Furthermore, native peptide chemical ligation reactions are performed inside living cells (Camarero et al., 2001) and widely used for protein engineering (Muir et al., 1998; Hofmann and Muir, 2002; David et al., 2004).
Thus, from a chemical point of view, a large body of data in peptide/protein chemistry indicates that the mechanisms allowing the proposed protein rearrangements (breakage of a peptide bond; excision and ligation of protein fragments) can be readily realized. Nothing would prevent the hydrolysis of a peptide bond and the formation of a new bond, once a crucial (deprotonated) side chain (e.g. a hydroxyl or thiol group) is in a position suitable for a nucleophilic attack on an electrophilic side chain of another amino acid or on an electrophilic linkage (in the same protein or in another protein).
Consequently, it would be rather a surprise, if at least a part of the many interacting proteins in a cell would not undergo rearrangement events and exchange defined polypeptide sequences (modules).
If such rearrangement actually occurs, it is likely to be facilitated by specialized "recombinase" proteins and/or proteases. Protease-catalysed protein splicing has been shown e.g. for antigen generation (Hanada et al., 2004; Vigneron et al., 2004; Warren et al., 2006).
Furthermore, a direct interaction of the recombining proteins need not always be necessary. The excised fragments could be maintained in a reactive state and brought together by "recombinase" proteins/proteases, which would allow the interaction and ligation of the fragments. A similar scenario is known e.g. for antigen processing, where peptides are excised from proteins by the proteasome in the cytosol and transported by the transporter associated with antigen processing (TAP) into the ER, where they are loaded onto MHC molecules, although the peptides are not covalently bound to the MHC molecules (Flutter and Gao, 2004).
The described chemical and mechanistic features of protein splicing immediately suggest possible mechanisms for the regulation of polypeptide rearrangement events and for the selection of only some proteins/interacting proteins for rearrangement.
Even when key nucleophilic side chains (hydroxyl or thiol groups) of interacting proteins are positioned suitably for a nucleophilic attack, the proteins need not always undergo splicing. Through e.g. phosphorylation or oxidation of the crucial hydroxyl or thiol groups, respectively, their nucleophilic properties can be abrogated. Another possible regulatory mechanism involves conformational changes that render the key nucleophilic and/or electrophilic side chains inaccessible.
Approaches to screen for protein rearrangement events
Phenomena that could indicate protein rearrangement events are observed in proteomic analyses. The most widely used approaches to identify protein sequences combine: (i) the separation of proteins by two-dimensional gel electrophoresis, which produces a gel with spots corresponding to individual proteins; (ii) cutting out the spots and digesting the proteins into shorter peptides with enzymes such as trypsin; and (iii) analysis of the peptide fragments by mass spectrometry to determine their masses (peptide-mass fingerprint). The peptide-mass fingerprint provides partial amino acid sequence information for the protein spots, which is then compared with sequence data predicted from genome, expressed sequence tag (EST) or protein databases, enabling the identification of the protein being examined (Figeys et al., 1998; Mann et al., 2001). However, the analysis of peptide-mass fingerprints frequently shows peptide sequences of the same protein in multiple gel spots, or peptide sequences of different proteins in one spot (e.g. Figeys et al., 1998; Santoni et al., 1998; Xia et al., 2008). For these spots, I will use the term mystifying spots.
Currently, the former phenomenon is explained by protein degradation or by the presence of isoforms and/or post-translational modifications, resulting in divergent migration of the "same protein type" through the gel. The second phenomenon is attributed to co-migration of different proteins in the same spot, or to protein contamination.
However, these phenomena could also reflect rearrangement events. The seemingly mismatched peptides could derive from one and the same precursor "mosaic" protein. Polypeptide rearrangements would indeed lead to the identification of different spots matched with peptides from the same protein, and of peptides from different proteins in one spot. Thus, a detailed analysis of the sequences from these mystifying spots would provide a useful approach to screen for protein rearrangement events. To this end, before digestion and mass spectrometric analysis, a potential precursor "mosaic" protein has to be highly purified from a given mystifying spot to ensure that this one protein is the only source of the generated and characterized peptide sequences.
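As a first pass, the mystifying spots could be picked out of existing identification tables before any further purification is attempted. The following sketch assumes a hypothetical input format, a list of (spot identifier, matched protein identifier) pairs as they might be exported from a database search; the function name `find_mystifying_spots` and the example identifiers are invented for illustration. It flags both patterns described above and says nothing about their cause, which may still be degradation, isoforms, co-migration or contamination.

```python
from collections import defaultdict


def find_mystifying_spots(assignments):
    """Flag candidate 'mystifying spots' in peptide-to-protein assignments.

    assignments: iterable of (spot_id, protein_id) pairs, one per matched
    peptide (hypothetical input format).
    Returns (multi_spot_proteins, mixed_spots):
      multi_spot_proteins: protein -> sorted spots, for proteins in >1 spot
      mixed_spots:         spot -> sorted proteins, for spots with >1 protein
    """
    proteins_by_spot = defaultdict(set)
    spots_by_protein = defaultdict(set)
    for spot, protein in assignments:
        proteins_by_spot[spot].add(protein)
        spots_by_protein[protein].add(spot)

    multi_spot_proteins = {
        protein: sorted(spots)
        for protein, spots in spots_by_protein.items()
        if len(spots) > 1
    }
    mixed_spots = {
        spot: sorted(proteins)
        for spot, proteins in proteins_by_spot.items()
        if len(proteins) > 1
    }
    return multi_spot_proteins, mixed_spots


example = [
    ("spot1", "P1"),
    ("spot2", "P1"),
    ("spot2", "P2"),
    ("spot3", "P3"),
]
multi, mixed = find_mystifying_spots(example)
print(multi)
print(mixed)
```

Spots in `mixed_spots` would be the natural candidates for the purification and resequencing step described above.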
Another phenomenon that could reflect polypeptide rearrangement events is the known antibody cross-reactivity. It is generally accepted that antibody cross reactivity is a consequence of the ability of an antibody to react with similar antigenic sites on different proteins. I suggest that some of the antibody cross-reactivities might reflect, at least to some extent, a consequence of rearranged epitopes. In such case, the antibody reacts with its specific epitope that is integrated by rearrangement events in diverse proteins rather than with similar antigenic sites in multiple proteins. Thus screening for protein rearrangement events could be achieved by sequencing the polypeptides recognized by a given antibody.
However, it will be a major challenge to identify the genes corresponding to a mosaic protein, not least because the gene products can be arranged in reverse order.
"Conventional" post-translational modifications contribute to functional diversity
All proteins are potentially subject to modifications during their lifetime. Post-translational modifications, such as acetylation, phosphorylation, ubiquitination, acylation, deamidation, methylation and glycosylation, represent mechanisms contributing to molecular and functional diversity (Seo and Lee, 2004).
For example, the T cell receptor (TCR) α and β chains together have seven N-glycans (Rudd et al., 1999). TCR glycosylation is necessary for the assembly of these chains into a mature TCR complex, which is composed of TCR-α and TCR-β chains, CD3 and ζ-chain accessory molecules and the co-receptor CD4 or CD8. Deficiency in β1,6 N-acetylglucosaminyltransferase V (Mgat5), an enzyme involved in the N-glycosylation pathway, enhances TCR clustering and signaling, as well as agonist-induced proliferation (Demetriou et al., 2001). Thus, the N-glycosylation of the TCR negatively regulates T-cell activation. Furthermore, N-linked glycosylation of native MHC-I molecules is required for antigen presentation and recognition by the TCR (Bagriaçik et al., 1996; Rudd et al., 1999). N-linked glycosylation of CD1, a cell surface receptor related to MHC molecules that can present non-peptidic lipid and glycolipid antigens to the TCR, protects the protein from endosomal proteases and plays a major role in the organisation and spacing of CD1 on the cell surface by preventing non-specific aggregation (Rudd et al., 1999).
Further examples of the modulatory effect of glycosylation are provided by the activities of immunoglobulin G (IgG). IgG mediates pro- and anti-inflammatory activities through the engagement of its Fc fragment (Fc) with distinct Fcγ receptors (FcγRs). At high doses, intravenous IgG is widely used as an anti-inflammatory agent for the treatment of autoimmune diseases (Kaveri et al., 2008; Nimmerjahn and Ravetch, 2007). Glycosylation of IgG is essential for binding to all FcγRs (Jefferis and Lund, 2002). The anti-inflammatory properties of IgG are acquired upon Fc sialylation, which is reduced upon the induction of an antigen-specific immune response (Kaneko et al., 2006). This differential sialylation may provide a switch from anti-inflammatory to pro-inflammatory effects of IgG.
However, such post-translational modifications neither alter the integrity of the amino acid sequences, which reflect the nucleotide sequence of the coding genes, nor lead to the generation of new proteins with completely different functions. Rather, these modifications act as regulators of the function, transport, interaction or stabilization of the proteins (e.g. reviewed in Seo and Lee, 2004).
In contrast to these "conventional" post-translational modifications, polypeptide rearrangement events use precursor proteins to provide new protein sequences (mosaic proteins), which are not genetically encoded in a conventional manner. Usually the production of new proteins requires e.g. the generation of new genes through the recombination of preexisting gene sequences. However, these recombination mechanisms that generate genetic diversity could sometimes be shifted to the post-genomic level through polypeptide rearrangements. Analogous to gene recombination, polypeptide rearrangement recombines partial gene products at the post-translational level to produce new polypeptide sequences, resulting indirectly in genetic diversity.
The implication of the polypeptide rearrangement hypothesis in genetic diversity
The polypeptide rearrangement hypothesis will have long-range consequences for the generation of genetic diversity. In the following section I will discuss the genetic implications of this proposed hypothesis.
Since the discovery of the DNA double helix, the genetic information for biosynthesis has been attributed exclusively to DNA. However, recent evidence indicates that RNA not only contributes to the generation of diversity through splicing, but also acts as an information carrier across generations (Rassoulzadegan et al., 2006; Pearson, 2006).
Polypeptide rearrangement would further yield intra- and inter-lateral dissemination of genetic information at the protein level. Thus, genetic diversity need not be limited to instructions and mechanisms at the DNA and RNA level but could also be expanded and adapted (according to cellular requirements) at the level of the protein. Consequently, besides studying proteins in the context of physiological and structural functions, there is a need to address the mechanisms underlying the function of proteins as co-designers and producers of genetic diversity.
Antigen splicing shows another surprising phenomenon. Van den Eynde and his colleagues showed that non-contiguous peptides can be spliced together in the reverse order to generate an appropriate antigen (Warren et al., 2006). This leads to the following suggestions. First, the genetic information could in some cases only make sense once it acquires the correct orientation at the polypeptide level. Second, it opens up the possibility that genetic and functional diversity can be increased by using the same sequence in different orientations. Thus, our present one-dimensional view of the transfer of genetic information from genes to proteins is beginning to shift. The symmetrical transfer of genetic information is no longer universally valid. The polypeptide rearrangement hypothesis opens up the possibility that two or more genes could in principle encode a mosaic protein. In other words, the genetic information for a "mosaic" protein A can be split across the genome, and the necessary polypeptide fragments can be picked out from the products of these genes and spliced together in any order to form protein A. Consequently, differential gene expression as a whole cannot be evaluated by RNA analysis alone, but also has to be analyzed directly at the protein level (e.g. through sequencing of polypeptides and mass spectrometry).
The fascinating aspect of living things is their complexity and yet individuality. In a cell that cannot be seen with the naked eye, there are thousands of molecules that act with amazing specificity and coordination to ensure survival, protection, development, proliferation, and interplay with the environment. The information for all these functions, the so-called genetic information, has to be transmitted from one generation to the next. Moreover, this genetic information has to be passed on safely to its destination (a biochemical function), within a cell or in an organism. To accomplish this, two requirements have to be met: first, storing as much genetic information as possible in a compact form and in a small amount of genetic material, from which a flow of instructions for all functions diverges; second, ensuring flexibility and the capacity to act during the transmission and implementation of genetic information. Furthermore, both requirements have to be met with a minimum of energy. The former is mainly met by the nucleotide sequences of the chromosomes, while phenomena such as RNA-mediated non-Mendelian inheritance (Rassoulzadegan et al., 2006; Pearson, 2006), multi-functional proteins, transcription-mediated gene fusion (Akiva et al., 2006), alternative splicing and gene recombination not only enable the preservation of all the information in a relatively small amount of genetic material but also provide flexibility and the capacity to act.
Polypeptide rearrangement would represent a further mechanism contributing to genetic diversity, as well as to flexibility and the capacity to act, particularly because of the energy benefit. To our present knowledge, protein splicing is catalysed through energy recycling, so that no energy supply is required. The time- and energy-consuming detour through transcription (which requires several transcription factors) and translation can be avoided, and additional genes and transcription factors for the mosaic proteins need not be encoded. Thus, polypeptide rearrangement would represent an effective mechanism enabling rapid adaptation to acute physiological changes (such as stress, starvation, or infection). For example, the inhibition of translation by viruses could be overcome by using polypeptide rearrangement events to ensure the synthesis of (protective) proteins.
The implication of the polypeptide rearrangement hypothesis for protein antigenic properties and diseases
Immune surveillance against infections uses a simple strategy: all cells and tissues have to report continuously on their molecular contents to specialized immune cells. For that reason, molecular fractions (antigens) from lipids, sugars and proteins of cells and tissues are processed and presented by cell surface receptors: the MHC-I, MHC-II and CD1 molecules. The MHC/CD1-antigen complexes are monitored by T cells through interactions with TCRs. During an infection, the molecular contents of the cells also include molecules from the infecting organism. The recognition of "non-self" antigens derived from this pathogen by the TCR leads to the stimulation of T cells, which can directly kill the infected cells or mobilize them to eliminate the invader themselves.
However, the effective function of this kind of immune surveillance presupposes that the immune system does not react destructively against self-antigens. Thus, the TCRs must be educated to tolerate self-antigens. This education takes place in both the thymus (central tolerance) and peripheral lymphoid tissues (peripheral tolerance), where antigen-presenting cells (APCs) present self-antigens to T cells. In the thymus, developing lymphocytes with no marked reactivity against self-peptides are positively selected in the thymic cortex and enter the circulation as mature lymphocytes. In contrast, developing lymphocytes with strong reactivity against self-peptides undergo negative selection (deletion) in the thymic medulla (Starr et al., 2003). Thus, ectopic expression of tissue-specific antigens by medullary thymic epithelial cells, termed promiscuous gene expression, is indispensable for the central induction of T-cell tolerance towards peripheral tissues (Magalhães et al., 2006).
Polypeptide rearrangement events lead to the generation of "mosaic proteins" from which new "hybrid-autoantigens" can be derived. The central induction of T-cell tolerance towards those "hybrid-autoantigens" presupposes that the promiscuous gene expression includes the ectopic production of "mosaic proteins" and that the "hybrid-autoantigens" are efficiently presented by the MHC alleles present in the individual. If "hybrid-autoantigens" are absent during early T-cell selection or bind weakly to MHC molecules, "hybrid-autoantigen"-reactive T cells will escape thymic deletion and initiate autoimmune diseases. A similar scenario is known for the myelin basic protein (MBP), which is a target antigen in experimental autoimmune encephalomyelitis (EAE), an animal model of multiple sclerosis. MBP 1-11, which is expressed in the thymus, has been shown to bind weakly to the class II MHC molecule I-Au and to form unstable peptide-MHC complexes, resulting in the escape of self-reactive cells from the thymus (Kumar et al., 1995; Harrington et al., 1998; Anderson and Kuchroo, 2003).
Furthermore, autoimmune reactions can also be initiated when aberrant polypeptide rearrangement events in the peripheral tissues lead to the generation of (non-tolerated) "neo-autoantigens". A similar scenario has been shown for many protein modifications in peripheral tissues, which affect the antigenicity and presentation of protein antigens and result in the initiation of autoimmunity (Lisowska, 2002; Anderton, 2004; Cloos and Christgau, 2004). An example is provided by the analysis of the T-cell determinants in collagen type II (CII)-induced arthritis (CIA), a widely used mouse model for human rheumatoid arthritis (RA). Most T-cell hybridomas were found to recognize the epitope CII(256-270) glycosylated with a monosaccharide (β-D-galactopyranose). This MHC class II-restricted T-cell epitope was immunodominant and arthritogenic (Corthay et al., 1998).
Finally, analogous to gene mutations that lead to impaired protein products and genetic diseases, aberrant polypeptide rearrangement events could also produce damaged mosaic proteins. Such aberrant rearrangements leave no marks (mutations) on gene sequences. Thus, the detection of such protein "mutations" requires determination of the complete sequence of the candidate proteins.
The polypeptide rearrangement hypothesis reveals a conceivable essential role for protein splicing and should open up a new field of investigation in protein chemistry. The complete expression profile cannot be provided by genome sequencing and analysis alone. In addition to the current focus on alternative pre-mRNA splicing, which is regarded as an important mechanism of protein diversity (Modrek and Lee, 2002), the concept of genetic diversity has to be expanded to include mosaic proteins. Provided that polypeptide rearrangement actually occurs, it is very likely that several known proteins have other, as yet undiscovered functions by contributing to the generation of mosaic proteins. It is incumbent on protein chemists to demonstrate the occurrence of polypeptide rearrangement and to reveal the mechanisms underlying the function of proteins (mosaic proteins) as co-designers and producers of genetic diversity. However, this demands e.g. the development of high-throughput techniques for protein sequencing.
Finally, the proposed polypeptide rearrangement hypothesis can partially resurrect the initial proposal for the role of polypeptides, decades ago, as a carrier of the genetic information.
I am grateful to Dr. Cemalettin Bekpen, Dr. Revathy Uthaiah Chottekalapanda and Dr. Joe Dramiga for critical reading and discussions during the preparation of this manuscript. I am grateful to Professor Jürgen Heesemann for supporting my work. Funding to pay the Open Access publication charges for this article was provided by Max von Pettenkofer Institute, Germany.