Proteomics Guide

Updated: January 28, 2007
Molecular Biologist's Guide to Proteomics
Paul R. Graves (1) and Timothy A. J. Haystead (1, 2)*
Department of Pharmacology and Cancer Biology, Duke University (1), and Serenex Inc. (2), Durham, North Carolina 27710
Microbiology and Molecular Biology Reviews, March 2002, p. 39-63, Vol. 66, No. 1
This review is intended to give the molecular biologist a rudimentary understanding of the technologies behind proteomics and their application to address biological questions. Entry of our laboratory into proteomics 5 years ago was driven by a need to define a complex mixture of proteins (36 proteins) we had affinity isolated that bound specifically to the catalytic subunit of protein phosphatase 1 (PP-1, a serine/threonine protein phosphatase that regulates multiple dephosphorylation events in cells) (26). We were faced with the task of trying to understand the significance of these proteins, and the only obvious way to begin to do this was to identify them by sequencing. We then bought an Applied Biosystems automated Edman sequencer (not having the budget for a mass spectrometer at the time). Since the majority of intact eukaryotic proteins are not immediately accessible to Edman sequencing due to posttranslational N-terminal modifications, we invented mixed-peptide sequencing (38). This method, described in detail later, essentially enables internal peptide sequence information to be derived from proteins electroblotted onto hydrophobic membranes. Using the mixed-peptide sequencing strategy, we identified all 36 proteins in about a week. The mixture contained at least two known PP-1 regulatory subunits, but most were identified in the expressed sequence tag or unannotated DNA databases and were novel proteins of unknown function. Since that time, we have been using various molecular biological approaches to determine the functions of some of these proteins. Herein lies the lesson of proteomics. Identifying long lists of potentially interesting proteins often generates more questions than it seeks to answer.
Despite learning this obvious lesson, our early sequencing experiences were an epiphany that has subsequently altered our whole scientific strategy for probing protein function in cells. The sequencing of the 36 proteins has opened new avenues to further explore the functions of PP-1 in intact cells. Because of increased sensitivity, our approaches now routinely use state-of-the-art mass spectrometry (MS) techniques. However, rather than using proteomics to simply characterize large numbers of proteins in complex mixtures, we see the real application of this technology as a tool to enhance the power of existing approaches currently used by the modern molecular biologist such as classical yeast and mouse genetics, tissue culture, protein expression systems, and site-directed mutagenesis. Importantly, the one message we would want the reader to take away from reading this review is that one should always let the biological question in mind drive the application of proteomics rather than simply engaging in an orgy of protein sequencing. From our experiences, we believe that if the appropriate controls are performed, proteomics is an extremely powerful approach for addressing important physiological questions. One should always design experiments to define a selected number of relevant proteins in the mixture of interest. Examples of such experiments that we routinely perform include defining early phosphorylation events in complex protein mixtures after hormone treatment of intact cells or comparing patterns of protein derived from a stimulated versus nonstimulated cell in an affinity pull-down experiment. Only the proteins that were specifically phosphorylated or bound in response to the stimulus are sequenced in the complex mixtures. Sequencing proteins that are regulated then has a meaningful outcome and directs all subsequent biological investigation. 
Definitions
The term "proteomics" was first coined in 1995 and was defined as the large-scale characterization of the entire protein complement of a cell line, tissue, or organism (13, 163, 167). Today, two definitions of proteomics are encountered. The first is the more classical definition, restricting the large-scale analysis of gene products to studies involving only proteins. The second and more inclusive definition combines protein studies with analyses that have a genetic readout such as mRNA analysis, genomics, and the yeast two-hybrid analysis (123). However, the goal of proteomics remains the same, i.e., to obtain a more global and integrated view of biology by studying all the proteins of a cell rather than each one individually.
Using the more inclusive definition of proteomics, many different areas of study are now grouped under the rubric of proteomics (Fig. 1). These include protein-protein interaction studies, protein modifications, protein function, and protein localization studies to name a few. The aim of proteomics is not only to identify all the proteins in a cell but also to create a complete three-dimensional (3-D) map of the cell indicating where proteins are located. These ambitious goals will certainly require the involvement of a large number of different disciplines such as molecular biology, biochemistry, and bioinformatics. It is likely that in bioinformatics alone, more powerful computers will have to be devised to organize the immense amount of information generated from these endeavors.

FIG. 1. Types of proteomics and their applications to biology.
 
In the quest to characterize the proteome of a given cell or organism, it should be remembered that the proteome is dynamic. The proteome of a cell will reflect the immediate environment in which it is studied. In response to internal or external cues, proteins can be modified by posttranslational modifications, undergo translocations within the cell, or be synthesized or degraded. Thus, examination of the proteome of a cell is like taking a "snapshot" of the protein environment at any given time. Considering all the possibilities, it is likely that any given genome can potentially give rise to an infinite number of proteomes.
Proteomics Origins
The first protein studies that can be called proteomics began in 1975 with the introduction of the two-dimensional gel by O'Farrell (119), Klose (87), and Scheele (140), who began mapping proteins from Escherichia coli, mouse, and guinea pig, respectively. Although many proteins could be separated and visualized, they could not be identified. Despite these limitations, shortly thereafter a large-scale analysis of all human proteins was proposed. The goal of this project, termed the human protein index, was to use two-dimensional protein electrophoresis (2-DE) and other methods to catalog all human proteins (14). However, lack of funding and technical limitations prevented this project from continuing.
Although the development of 2-DE was a major step forward, the science of proteomics would have to wait until the proteins displayed by 2-DE could be identified. One problem that had to be overcome was the lack of sensitive protein-sequencing technology. Improving sensitivity was critical for success because biological samples are often limiting and both one-dimensional (1-D) and two-dimensional (2-D) gels have limits in protein loading capacity. The first major technology to emerge for the identification of proteins was the sequencing of proteins by Edman degradation (45). A major breakthrough was the development of microsequencing techniques for electroblotted proteins (6-8). This technique was used for the identification of proteins from 2-D gels to create the first 2-D databases (31). Improvements in microsequencing technology resulted in increased sensitivity of Edman sequencing in the 1990s to high-picomole amounts (6). One of the most important developments in protein identification has been the development of MS technology (11). In the last decade, the sensitivity of analysis and accuracy of results for protein identification by MS have increased by several orders of magnitude (11, 123). It is now estimated that proteins in the femtomolar range can be identified in gels. Because MS is more sensitive, can tolerate protein mixtures, and is amenable to high-throughput operations, it has essentially replaced Edman sequencing as the protein identification tool of choice.
Genome Information
The growth of proteomics is a direct result of advances made in large-scale nucleotide sequencing of expressed sequence tags and genomic DNA. Without this information, proteins could not be identified even with the improvements made in MS. Protein identification (by MS or Edman sequencing) relies on the presence of some form of database for the given organism (122, 146). The majority of DNA and protein sequence information has accumulated within the last 5 to 10 years (23).
In 1995, the first complete genome of an organism was sequenced, that of Haemophilus influenzae (56). At the time of this writing, the sequencing of the genomes of 45 microorganisms has been completed and that of 170 more is under way (http://www.tiger.org/tdb/mdb/mdbcomplete.html). To date, five eukaryotic genomes have been completed: Arabidopsis thaliana (154), Saccharomyces cerevisiae (58), Schizosaccharomyces pombe (128), Caenorhabditis elegans (1), and Drosophila melanogaster (3, 113, 138). In addition, the rice (105), mouse (178a), and human (93, 161) genomes are near completion.
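The dependence of protein identification on sequence databases can be illustrated with a minimal sketch. It assumes that a short stretch of peptide sequence has already been obtained (by Edman sequencing or MS) and simply looks for it in a database. The function name `identify`, the toy database, and every sequence in it are hypothetical and chosen only for illustration; real search tools are far more elaborate.

```python
# Toy illustration: identifying a protein from a short sequenced peptide.
# All protein names and sequences below are made up for this example.

def identify(peptide_tag, database):
    """Return the names of database proteins whose sequence contains the tag."""
    tag = peptide_tag.upper()
    return sorted(name for name, seq in database.items() if tag in seq)


toy_database = {
    "protein_A": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "protein_B": "MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSG",
    "protein_C": "MAHHHHHHVKSHFSRQAPLTEEK",
}

hits_unique = identify("QISFVK", toy_database)    # tag found in one entry
hits_shared = identify("VKSHFSRQ", toy_database)  # tag shared by two entries
hits_none = identify("WWWCCC", toy_database)      # tag absent from the database

print(hits_unique, hits_shared, hits_none)
```

The third query shows the point made above: a peptide from an organism whose sequences are not in the database returns nothing, however good the sequencing data are. The second query shows that a tag shared by several entries cannot by itself distinguish between them.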
Why Proteomics?
Many types of information cannot be obtained from the study of genes alone. For example, proteins, not genes, are responsible for the phenotypes of cells. It is impossible to elucidate mechanisms of disease, aging, and effects of the environment solely by studying the genome. Only through the study of proteins can protein modifications be characterized and the targets of drugs identified.
Annotation of the genome. One of the first applications of proteomics will be to identify the total number of genes in a given genome. This "functional annotation" of a genome is necessary because it is still difficult to predict genes accurately from genomic data (46). One problem is that the exon-intron structure of most genes cannot be accurately predicted by bioinformatics (43). To achieve this goal, genomic information will have to be integrated with data obtained from protein studies to confirm the existence of a particular gene.
Protein expression studies. In recent years, the analysis of mRNA expression by various methods has become increasingly popular. These methods include serial analysis of gene expression (SAGE) (160) and DNA microarray technology (142, 143). However, the analysis of mRNA is not a direct reflection of the protein content in the cell. Indeed, many studies have now shown a poor correlation between mRNA and protein expression levels (2, 12, 67, 75). The formation of mRNA is only the first step in a long sequence of events resulting in the synthesis of a protein (Fig. 2). First, mRNA is subject to posttranscriptional control in the form of alternative splicing, polyadenylation, and mRNA editing (117). Many different protein isoforms can be generated from a single gene at this step. Second, mRNA then can be subject to regulation at the level of protein translation (78). Third, proteins, once formed, are subject to posttranslational modification. It is estimated that up to 200 different types of posttranslational protein modification exist (89). Proteins can also be regulated by proteolysis (86) and compartmentalization (33). The average number of protein forms per gene was predicted to be one or two in bacteria, three in yeast, and three or more in humans (168). Therefore, it is clear that the tenet of "one gene, one protein" is an oversimplification.
In addition, some bodily fluids such as serum or urine have no mRNA source and therefore cannot be studied by mRNA analysis.

FIG. 2. Mechanisms by which a single gene can give rise to multiple gene products. Multiple protein isoforms can be generated by RNA processing when RNA is alternatively spliced or edited to form mature mRNA. mRNA, in turn, can be regulated by stability and efficiency of translation. Proteins can be regulated by additional mechanisms, including posttranslational modification, proteolysis, or compartmentalization.
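Why "one gene, one protein" breaks down can be seen from a small worked calculation. The sketch below is a deliberately simplified counting model, not a result from the literature: it assumes that each splice isoform carries a fixed number of modification sites and that every site is independently either modified or unmodified. The function name `count_protein_forms` and the example numbers are hypothetical.

```python
# Simplified counting model (an assumption for illustration only):
# every splice isoform has the same number of modification sites, and each
# site is independently either modified or unmodified.

def count_protein_forms(splice_isoforms, modification_sites):
    """Combinatorial ceiling on distinct protein forms from a single gene."""
    return splice_isoforms * 2 ** modification_sites


one_gene_one_protein = count_protein_forms(1, 0)  # no splicing, no modification
two_isoforms_three_sites = count_protein_forms(2, 3)

print(one_gene_one_protein, two_isoforms_three_sites)
```

The result is only a ceiling on what is combinatorially possible. The number of forms actually present in a cell is far smaller, as the per-gene averages quoted above indicate, and which forms are present depends on the state of the cell.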
 
Protein function. According to one study, no function can be assigned to about one-third of the sequences in organisms for which the genomes have been sequenced (47). The complete identification of all proteins in a genome will aid the field of structural genomics in which the ultimate goal is to obtain 3-D structures for all proteins in a proteome. This is necessary because the functions of many proteins can only be inferred by examination of their 3-D structure (24).
Protein modifications. One of the most important applications of proteomics will be the characterization of posttranslational protein modifications. Proteins are known to be modified posttranslationally in response to a variety of intracellular and extracellular signals (74). For example, protein phosphorylation is an important signaling mechanism and dysregulation of protein kinases or phosphatases can result in oncogenesis (74). By using a proteomics approach, changes in the modifications of many proteins expressed by a cell can be analyzed simultaneously.
Protein localization and compartmentalization. One of the most important regulatory mechanisms known is protein localization. The mislocalization of proteins is known to have profound effects on cellular function (e.g., cystic fibrosis) (42). Proteomics aims to identify the subcellular location of each protein. This information can be used to create a 3-D protein map of the cell, providing novel information about protein regulation.
Protein-protein interactions. Of fundamental importance in biology is the understanding of protein-protein interactions. The process of cell growth, programmed cell death, and the decision to proceed through the cell cycle are all regulated by signal transduction through protein complexes (127). Proteomics aims to develop a complete 3-D map of all protein interactions in the cell. One step toward this goal was recently completed for the microorganism Helicobacter pylori (133). Using the yeast two-hybrid method to detect protein interactions, 1,200 connections were identified between H. pylori proteins covering 46.6% of the genome (133). A comprehensive two-hybrid analysis has also been performed on all the proteins from the yeast S. cerevisiae (157).
Types of Proteomics
Protein expression proteomics. The quantitative study of protein expression between samples that differ by some variable is known as expression proteomics.
In this approach, protein expression of the entire proteome or of subproteomes between samples can be compared. Information from this approach can identify novel proteins in signal transduction or identify disease-specific proteins.
Structural proteomics. Proteomics studies whose goal is to map out the structure of protein complexes or the proteins present in a specific cellular organelle are known as "cell map" or structural proteomics (21). Structural proteomics attempts to identify all the proteins within a protein complex or organelle, determine where they are located, and characterize all protein-protein interactions. An example of structural proteomics was the recent analysis of the nuclear pore complex (137). Isolation of specific subcellular organelles or protein complexes by purification can greatly simplify the proteomic analysis (83). This information will help piece together the overall architecture of cells and explain how expression of certain proteins gives a cell its unique characteristics.
Functional proteomics. "Functional proteomics" is a broad term for many specific, directed proteomics approaches. In some cases, specific subproteomes are isolated by affinity chromatography for further analysis. This could include the isolation of protein complexes or the use of protein ligands to isolate specific types of proteins. This approach allows a selected group of proteins to be studied and characterized and can provide important information about protein signaling, disease mechanisms or protein-drug interactions.
TECHNOLOGY OF PROTEOMICS
An integral part of the growth of proteomics has been in the advances made in protein technologies. Twenty-six years ago, when 2-DE was introduced, very few tools existed for proteomics. Since that time, new technologies have emerged and old ones have been improved in areas from protein separation to protein identification. However, it is also clear that it is still not feasible to conduct many types of proteomics because of limitations in technology. These problems will have to be solved and new technologies must be developed for proteomics to reach its full potential. A typical proteomics experiment (such as protein expression profiling) can be broken down into the following categories: (i) the separation and isolation of proteins from a cell line, tissue, or organism; (ii) the acquisition of protein structural information for the purposes of protein identification and characterization; and (iii) database utilization.
Separation and Isolation of Proteins
By the very definition of proteomics, it is inevitable that complex protein mixtures will be encountered. Therefore, methods must exist to resolve these protein mixtures into their individual components so that the proteins can be visualized, identified, and characterized. The predominant technology for protein separation and isolation is polyacrylamide gel electrophoresis. Unlike the breakthroughs in molecular biology that eventually enabled the sequencing of the human genome, some aspects of protein science have shown little progress over the years. Protein separation technology is one of them. Since its inception some 32 years ago (92), protein electrophoresis still remains the most effective way to resolve a complex mixture of proteins. In many applications, it is at this stage where the bottleneck occurs. This is because 1- or 2-DE is a slow, tedious procedure that is not easily automated. However, until something replaces this methodology, it will remain an essential component of proteomics.
One- and two-dimensional gel electrophoresis. For many proteomics applications, 1-DE is the method of choice to resolve protein mixtures. In 1-DE, proteins are separated on the basis of molecular mass. Because proteins are solubilized in sodium dodecyl sulfate (SDS), protein solubility is rarely a problem. Moreover, 1-DE is simple to perform, is reproducible, and can be used to resolve proteins with molecular masses of 10 to 300 kDa. The most common application of 1-DE is the characterization of proteins after some form of protein purification. This is because of the limited resolving power of a 1-D gel. If a more complex protein mixture such as a crude cell lysate is encountered, then 2-DE can be used. In 2-DE, proteins are separated by two distinct properties. They are resolved according to their net charge in the first dimension and according to their molecular mass in the second dimension. The combination of these two techniques produces resolution far exceeding that obtained in 1-DE. One of the greatest strengths of 2-DE is the ability to resolve proteins that have undergone some form of posttranslational modification. This resolution is possible in 2-DE because many types of protein modifications confer a difference in charge as well as a change in mass on the protein. One such example is protein phosphorylation. Frequently, the phosphorylated form of a protein can be resolved from the nonphosphorylated form by 2-DE. In this case, a single phosphoprotein will appear as multiple spots on a 2-D gel (94). In addition, 2-DE can detect different forms of proteins that arise from alternative mRNA splicing or proteolytic processing. The primary application of 2-DE continues to be protein expression profiling. In this approach, the protein expression of any two samples can be qualitatively and quantitatively compared. 
The appearance or disappearance of spots can provide information about differential protein expression, while the intensity of those spots provides quantitative information about protein expression levels. Protein expression profiling can be used for samples from whole organisms, cell lines, tissues, or bodily fluids. Examples of this technique include the comparison of normal and diseased tissues (44) or of cells treated with various drugs or stimuli (30, 57, 69, 141, 144). An example of 2-DE used in protein profiling is shown in Fig. 3.
FIG. 3. Protein expression profiling by 2-DE. Whole-cell lysates from nontransformed and Abelson murine leukemia virus (AMuLV)-transformed mouse fibroblasts were resolved by 2-DE, and proteins were visualized by silver staining. Differentially expressed proteins were excised from the gel and identified by MS.
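The logic of expression profiling, in which spots that appear or disappear indicate differential expression and spot intensities give quantitative information, can be sketched as follows. This is a minimal illustration under stated assumptions: spots are taken to be already matched between the two gels and labeled with a shared identifier, intensities are in arbitrary units, and the twofold cutoff is an arbitrary choice rather than a recommended threshold. The function name `profile_spots` and all values are hypothetical.

```python
# Minimal sketch of comparing spot intensities between two 2-D gels.
# Assumptions: spots are already matched across gels by a shared label,
# intensities are in arbitrary units, and the twofold cutoff is arbitrary.

def profile_spots(control, treated, fold_cutoff=2.0):
    """Classify each spot by comparing its intensity on the two gels."""
    calls = {}
    for spot in sorted(set(control) | set(treated)):
        c = control.get(spot, 0.0)
        t = treated.get(spot, 0.0)
        if c <= 0.0 and t <= 0.0:
            calls[spot] = "absent"
        elif c <= 0.0:
            calls[spot] = "appeared"
        elif t <= 0.0:
            calls[spot] = "disappeared"
        elif t / c >= fold_cutoff:
            calls[spot] = "increased"
        elif c / t >= fold_cutoff:
            calls[spot] = "decreased"
        else:
            calls[spot] = "unchanged"
    return calls


control_gel = {"spot1": 100.0, "spot2": 50.0, "spot3": 80.0, "spot4": 40.0}
treated_gel = {"spot1": 105.0, "spot2": 150.0, "spot4": 10.0, "spot5": 60.0}

calls = profile_spots(control_gel, treated_gel)
for spot, call in calls.items():
    print(spot, call)
```

Only the spots called as changed would then be excised and identified, which keeps the sequencing effort focused on proteins that respond to the variable under study.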
 