Original Article

Membrane protein structural biology – How far can the bugs take us? (Review)

Pages 329-332 | Received 13 Feb 2007, Published online: 09 Jul 2009

Abstract

Membrane proteins are core components of many essential cellular processes, and high-resolution structural data are therefore highly sought after. However, owing to the many bottlenecks associated with membrane protein crystallization, progress has been slow. One major problem is our inability to obtain sufficient quantities of membrane proteins for crystallization trials. Traditionally, membrane proteins have been isolated from natural sources or, for prokaryotic proteins, expressed by recombinant techniques. We are, however, still a long way from streamlined overproduction of eukaryotic proteins. With this technical limitation in mind, we ask how far prokaryotic homologues can take us towards a structural understanding of the eukaryotic/human membrane proteome(s).

Introduction

α-helical membrane proteins (MPs) are involved in a wide range of vital cellular processes, as well as in numerous medical conditions. Many scientists in both academia and industry therefore strive for a molecular understanding of how they function. Essential to this process is a high-resolution structure, so that structure-function analyses can be conducted at an atomic level. High-resolution structures are therefore much anticipated, but progress has been slow: as of 1 January 2007, MPs comprised less than 0.5% of the Protein Data Bank (Berman et al. 2002), even though they constitute 20–30% of a typical proteome (Krogh et al. 2001, Granseth et al. 2005, Wallin & von Heijne 1998).

The striking paucity of MP structures, as compared to soluble proteins, does not reflect the level of investment or interest. Indeed, MP structures are extremely well received by the scientific community. The majority of novel structures (92%) have found their way into high-impact journals like Nature, Science or Cell (Figure 1). These structures are also frequently rewarded with an editorial comment (55%) and/or an illustration on the cover (31%) of the journal. Most notably, 8% of the novel structures have helped win a Nobel Prize. So if the reward justifies the effort, what's the problem?

Figure 1.  Membrane protein structures are hot. Analysis of high-resolution membrane protein structures collected by Stephen White (http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html) indicates that novel structures (black bars) are often published in high-impact journals (i.e. Nature, Science or Cell) and accompanied by ‘high-profile accessories’ (i.e., commentaries and covers). Surprisingly, improved or follow-up structures (white bars) also maintain impact, albeit less than novel structures.

The poor progress of structural determination can be attributed to the fact that MPs are incompatible with the structural genomics approaches that have been so successful for their soluble counterparts. A major bottleneck is obtaining sufficient quantities of material for crystallization trials. Early crystallographic success was possible with proteins that were naturally enriched in biological membranes (Figure 2; Supplementary Table I, online version only), but this approach is obviously not applicable to the majority of MPs. A more viable approach is the production of recombinant proteins using overexpression hosts such as Escherichia coli. This approach has already had a significant impact, providing material for many structures in recent years (Figure 2). To date, however, recombinant overexpression has worked mainly for prokaryotic MPs, as eukaryotic MPs are still very difficult to overexpress (reviewed in Drew et al. 2003, Wagner et al. 2006). Despite considerable effort, only five eukaryotic MP structures have been obtained using material produced by recombinant techniques: three were overexpressed in Pichia pastoris and two in E. coli. Hopefully, this is the beginning of a positive trend.

Figure 2.  Eukaryotic membrane proteins produced by recombinant techniques are scarce in structural biology. Analysis of high-resolution membrane protein structures collected by Stephen White (http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html) indicates that most structures have been solved from proteins purified from natural sources (striped bar), see inset. Of these, 44% were prokaryotic and 56% eukaryotic. Another successful approach is the production of recombinant prokaryotic proteins (light grey bar), see inset. In almost all of these cases the proteins have been produced in E. coli (not shown). Recombinantly produced eukaryotic membrane proteins have so far contributed very few structures (dark grey bar). Closer inspection indicates that most of the early structures were solved from proteins purified from natural sources, whereas recombinant technology arrived more recently. We are grateful to Niek Dekker (AstraZeneca, Sweden) for assistance with this Figure.

Supplementary Table I. Structural status of the 25 largest MP families. Proteins collected by Stephen White (http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html).

Given the difficulties of overexpressing eukaryotic MPs, it seems as if the prokaryotic pathway is currently the most viable way forward in our aim for a structural understanding of MPs. The critical question then becomes: how far can prokaryotic MPs take us towards the ultimate goal of understanding eukaryotic/human MPs?

Eukaryotic/human MPs with prokaryotic homologs

To answer this question, we have searched for MP families that are shared between prokaryotes and eukaryotes in a recently compiled database (Oberai et al. 2006). This database contains 3961 MP families derived from the fully sequenced genomes of 13 eukaryotic and 82 bacterial/archaeal organisms (see Box 1). It is obvious at first glance that prokaryotic membrane proteomes are poor models for eukaryotic organisms, as only 256 families are common to the two groups, representing a mere ∼13% of the eukaryotic MP families (and ∼14% of the human MP families) (Figure 3). On the positive side, a closer look reveals that while these so-called ‘universal families’ are few, they are on average larger than ‘unique families’. Given their ubiquity in nature, these omnipresent families are of particular interest to the scientific community, and have therefore been preferentially targeted by the structural genomics community (Supplementary Table I).

Figure 3.  Venn diagrams showing the distribution of membrane protein families. The upper panel shows the distribution of families between prokaryotes and eukaryotes, and the lower panel shows the distribution of families between prokaryotes and Homo sapiens. The large number denotes the number of families in that category. The superscript denotes the average family size, and the subscript denotes how many families in that category contain at least one member with known structure. It is evident that few families are common to both prokaryotes and eukaryotes/Homo sapiens. These families are, however, very large (i.e., conserved across many species).

Although they are hotly pursued, there are still quite a few (>200) universal families lacking a descriptive structure, and in these cases a prokaryotic MP may shed light on the structure and function of a eukaryotic/human homolog (see Box 2). These families include ABC transporter #1 (amino acid, phosphate, ferric, nitrate, nickel, taurine, sugar), Transporter #1 (cationic acid, aromatic amino acid, choline K+ uptake) and Transporter #2 (drug/metabolite, amino acid transport). With the ever-increasing genome-wide screenings of orthologous proteins for crystallization trials, structures for these families can be anticipated in the near future.

However, in our struggle to obtain a detailed structural understanding of the eukaryotic MPs, it is clear that prokaryotic MPs will only take us to ‘first base’ (see also Fleishman et al. 2006). Approximately 85% of the eukaryotic families do not have a prokaryotic homolog, so the structures of these proteins will need to be solved from eukaryotic sources (Figure 3). Further, prokaryotic MPs can ultimately only serve as models for eukaryotic/human proteins, and eventually there will be a desire to solve the structure of the actual protein of interest. This tells us that in the long term, we must solve the overexpression bottleneck for eukaryotic MPs. Until then, we might have to make the most of the natural sources.

Box 1. Membrane proteins: what is a family and what is a fold?

According to the Structural Classification of Proteins (SCOP), two proteins belong to the same fold if they are similarly ordered in three dimensions, i.e., if their secondary structures match and they have a similar topological arrangement (Murzin et al. 1995). Two proteins that share the same fold do not have to be evolutionarily related. A deeper level of structural similarity is achieved at the family level: two proteins belong to the same family if they are evolutionarily related, in which case they typically have a pair-wise sequence identity of ≥30%. Currently, SCOP (release 1.71) contains 941 different folds and 3004 families. These numbers are based on proteins with known three-dimensional structures. Some 34 folds and 44 families contain α-helical MPs. Notably, for both soluble and membrane-integrated proteins, the majority of families can be attributed to a few folds, while there is a large number of folds that comprise only a handful of families (Govindarajan et al. 1999, Ubarretxena-Belandia & Engelman 2001, Oberai et al. 2006).
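The pair-wise sequence identity mentioned above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned, that gaps are written as '-', and that identity is counted over the columns in which both sequences have a residue, which is only one of several conventions in use. The function names and the default 30% cut-off are hypothetical illustrations of the typical value quoted above; a sequence-identity cut-off alone does not define a SCOP family.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_family_threshold(aligned_a: str, aligned_b: str,
                           threshold: float = 30.0) -> bool:
    """True if the pair reaches the (hypothetical, adjustable) identity cut-off."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, percent_identity("ACDE", "ACDF") compares four columns, finds three matches, and returns 75.0.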

Since many protein classification resources (such as SCOP) only classify MPs with known three-dimensional structures, and owing to the scarcity of structures, several groups have attempted to make projections about the size of the structural space of MPs, based on predicted transmembrane regions and sequence similarity (Jones 1998, Martin-Galiano & Frishman 2006, Oberai et al. 2006). Oberai and coworkers conclude that in order to have 80% of all MPs assigned to a fold or family, ∼300 folds and ∼700 families are required. They estimate that if the number of MP structures increases exponentially, as predicted by Stephen White (White 2004), this goal should be reached somewhere between the years 2020 and 2034. In a similar study by Martin-Galiano and Frishman (2006), 24 of 266 MP sequence clusters (corresponding to folds) already contain at least one structure, and these clusters cover approximately 70% of all MPs. However, this study was based on prokaryotic MPs only.

Box 2. 3D structure modelling of membrane proteins

How far can the structure of a prokaryotic protein take us towards an understanding of a eukaryotic/human homolog? In a recent study (Forrest et al. 2006), the authors assessed how successful homology modelling is for 8 different MP families. It was possible to create acceptable models (Cα root mean square deviation (RMSD) <2 Å compared to the native structure) if the sequence identity was above 30%. One limitation in assessing the quality of homology modelling of membrane proteins is that few families contain more than two structures and can therefore be used as test cases.

But one should not despair: if there is no suitable template structure to begin the modelling from, recent progress in de novo structure prediction methods shows some promise (Yarov-Yarovoy et al. 2006). Twelve multipass membrane proteins, both fragments and full structures, were modelled by the Rosetta-Membrane method, and the corresponding predictions often had significant regions with an RMSD within 4 Å of the native structure. The results were comparable to the accuracy of low-resolution predictions made for water-soluble proteins of the same length. However, Rosetta-Membrane is not yet able to create the full-atom models of the proteins needed for docking studies.
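Both accuracy criteria quoted in this box are RMSD cut-offs, so a short Python sketch of the calculation may be helpful. It assumes that the model and native Cα coordinates are given in Å, are listed in the same residue order, and have already been optimally superposed (the superposition step itself is not shown). The function names are hypothetical, and this is an illustration of the measure rather than the procedure used in the cited studies.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]  # x, y, z in Angstrom


def ca_rmsd(model: Sequence[Coord], native: Sequence[Coord]) -> float:
    """Root mean square deviation between matched C-alpha coordinates.

    Assumption: the two coordinate lists are already superposed and
    correspond residue by residue.
    """
    if len(model) != len(native) or len(model) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (mx, my, mz), (nx, ny, nz) in zip(model, native):
        squared_sum += (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
    return math.sqrt(squared_sum / len(model))


def within_cutoff(model: Sequence[Coord], native: Sequence[Coord],
                  cutoff: float = 2.0) -> bool:
    """True if the RMSD is below the chosen cut-off (2 A by default)."""
    return ca_rmsd(model, native) < cutoff
```

Passing cutoff=4.0 instead of the default applies the looser criterion mentioned for the de novo predictions.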

Acknowledgements

We thank Drs Amit Oberai and James U. Bowie for kindly providing data. This work was supported by grants from the Swedish Research Council, the Marianne and Marcus Wallenberg Foundation, and the Swedish Foundation for Strategic Research to GvH and DOD.

References

  • Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, Feng Z, Gilliland GL, Iype L, Jain S, Fagan P, Marvin J, Padilla D, Ravichandran V, Schneider B, Thanki N, Weissig H, Westbrook JD, Zardecki C. The Protein Data Bank. Acta Crystallogr D Biol Crystallogr 2002; 58: 899–907
  • Drew D, Froderberg L, Baars L, de Gier JW. Assembly and overexpression of membrane proteins in Escherichia coli. Biochim Biophys Acta 2003; 1610: 3–10
  • Fleishman SJ, Unger VM, Ben-Tal N. Transmembrane protein structures without X-rays. Trends Biochem Sci 2006; 31: 106–113
  • Forrest LR, Tang CL, Honig B. On the accuracy of homology modeling and sequence alignment methods applied to membrane proteins. Biophys J 2006; 91: 508–517
  • Govindarajan S, Recabarren R, Goldstein RA. Estimating the total number of protein folds. Proteins 1999; 35: 408–414
  • Granseth E, Daley DO, Rapp M, Melen K, von Heijne G. Experimentally constrained topology models for 51,208 bacterial inner membrane proteins. J Mol Biol 2005; 352: 489–494
  • Jones DT. Do transmembrane protein superfolds exist? FEBS Lett 1998; 423: 281–285
  • Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane topology with a hidden Markov model: application to complete genomes. J Mol Biol 2001; 305: 567–580
  • Martin-Galiano AJ, Frishman D. Defining the fold space of membrane proteins: the CAMPS database. Proteins 2006; 64: 906–922
  • Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995; 247: 536–540
  • Oberai A, Ihm Y, Kim S, Bowie JU. A limited universe of membrane protein families and folds. Protein Sci 2006; 15: 1723–1734
  • Ubarretxena-Belandia I, Engelman DM. Helical membrane proteins: diversity of functions in the context of simple architecture. Curr Opin Struct Biol 2001; 11: 370–376
  • Wagner S, Bader ML, Drew D, de Gier JW. Rationalizing membrane protein overexpression. Trends Biotechnol 2006; 24: 364–371
  • Wallin E, von Heijne G. Genome-wide analysis of integral membrane proteins from eubacterial, archaean and eukaryotic organisms. Protein Sci 1998; 7: 1029–1038
  • White SH. The progress of membrane protein structure determination. Protein Sci 2004; 13: 1948–1949
  • Yarov-Yarovoy V, Schonbrun J, Baker D. Multipass membrane protein structure prediction using Rosetta. Proteins 2006; 62: 1010–1025
