ISSN: 0974-276X
Editorial - (2013) Volume 6, Issue 1
A key step in proteomics is the matching of sequences obtained from tandem mass spectrometry of (usually) tryptic peptides to genomes held in public-domain databases [1]. Such genomes first became available for important model organisms such as human, mouse and rat, and for a range of other representative animal, plant and microbial species [2]. These model species therefore became accessible to proteomic studies, and rich, reliable data could be obtained. However, scientists working with non-model organisms often failed to identify proteins reliably, even when de novo sequence data from tandem MS were available, because non-model species tend to be poorly represented in public-domain sequence databases [3]. Such studies have often had to rely on sequence similarity to organisms that are represented in the databases, rather than on sequence identity, and sometimes no reliable match was possible even with this approach. However, recent improvements in high-throughput genome sequencing, notably “next-generation sequencing”, have sharply increased the rate of genome sequencing, which now outpaces Moore’s Law, more than doubling each year [4]. For example, in 2007 a single sequencing run could generate 1 Gbp of sequence data but, by 2011, such a run could generate 1 Tbp, a thousand-fold increase in just four years (roughly ten doublings, or a doubling time of about five months, against the approximately two-year doubling of Moore’s Law). The list of whole genomes therefore grows apace [2].
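To make the matching step concrete, the sketch below illustrates the principle in Python: database proteins are digested in silico using the classic trypsin rule (cleave after K or R, but not before P), and peptide sequences inferred de novo from tandem MS spectra are then looked up by exact match. The accessions and sequences are invented placeholders, and real search engines (Mascot, SEQUEST and similar tools) score observed fragment spectra against theoretical spectra rather than comparing plain strings, so this is a minimal illustration of the idea, not a working search engine.

import re
from collections import defaultdict

def tryptic_digest(protein, missed_cleavages=0):
    # Cleave after K or R except when followed by P (the classic trypsin rule).
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", protein) if f]
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(fragments[i:j + 1]))
    return peptides

def build_index(database):
    # Map each tryptic peptide to the accessions of proteins that contain it.
    index = defaultdict(set)
    for accession, sequence in database.items():
        for peptide in tryptic_digest(sequence, missed_cleavages=1):
            index[peptide].add(accession)
    return index

# Toy stand-in for a public-domain sequence database (invented accessions).
database = {
    "sp|P00001|MODEL": "MKWVTFISLLFLFSSAYSRGVFRRDAHK",
    "sp|P00002|OTHER": "MSLNDFQKIVREAGTK",
}
index = build_index(database)

# Peptide sequences "read" de novo from tandem MS spectra of a digest.
for peptide in ["DAHK", "IVR", "NOTPRESENT"]:
    hits = sorted(index.get(peptide, []))
    print(peptide, "->", hits if hits else "no match")

For a non-model species absent from the database, the exact lookup above simply fails, which is why such studies have had to fall back on cross-species, similarity-based matching as described above.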
A good example of the use of non-model organisms is environmental toxicology (ecotoxicology), where “sentinel species” have long been used to assess the pollution status of natural environments [5]. The idea is that such organisms may show population-level, physiological or biochemical effects when stressed by adverse environmental conditions arising from climate change or exposure to chemical pollutants. Much interest has centred on aquatic organisms inhabiting freshwater, estuarine or coastal habitats, since the majority of human settlement worldwide lies within a relatively short distance of the coast. Key sources of pollution and environmental degradation, such as industrial, agricultural and shipping activities, are therefore often adjacent to rivers, estuaries and coastlines. More recently, there has been growing concern about the implications of climate change, habitat loss, deterioration of water quality and ocean acidification, and about anthropogenic (man-made) pollutants, especially emerging categories such as endocrine disruptors [5], nanomaterials [6] and human/animal pharmaceuticals [7]. The tendency has been to use robust organisms that can withstand a range of pollution and environmental stress, have a widespread geographical distribution and lead a relatively sessile life-style. Examples include bivalves of the genus Mytilus (in estuaries) and the water-flea Daphnia (in freshwater). Since the 1960s, much work has concentrated on identifying biochemical indices of damage (biomarkers) in such species, for example lipid peroxidation or changes in stress-response protein profiles. In principle, proteomics provides a welcome complement to these classical approaches [8,9], but there is a problem: many aquatic species have especially large genomes (as big as, or even bigger than, the human genome). They have therefore been poorly represented in public-domain sequence databases, and this has severely limited the potential of proteomics approaches in ecotoxicology [8,9]. However, the accelerating pace of genome sequencing described above has recently made available the full genomes of Daphnia pulex [10] and the Pacific oyster, Crassostrea gigas [11], while that of Mytilus is eagerly awaited. These developments have significantly improved the prospects for proteomics as a tool in future ecotoxicology research. They will make possible the identification of novel biomarkers and the elucidation of toxicity mechanisms, as well as data-rich explorations of effects on entire protein networks.