Comment

Deepening constantly understanding of protein folding problem

Pages 956-960 | Published online: 05 Feb 2013

1 Introduction

Research on the protein folding problem has made tremendous advances over the past few decades, although huge challenges remain. The famous Anfinsen experiment showed that proteins can fold reversibly, implying that the native structures of some small globular proteins are thermodynamically stable states and therefore conformations at the global minima of their accessible free energies. Levinthal argued, on the basis of simple phenomenological kinetic models, that there are too many possible conformations for a protein to find its native structure by random searching of conformational space alone. He concluded that proteins must fold by specific folding pathways, and this argument led to a search for such pathways. Classical experiments, however, generally probe only the average behavior of the protein and cannot resolve much atomic detail. Some confusion has also arisen from using the single term “pathway” for both microscopic and macroscopic ideas.
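Levinthal's counting argument can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the chain length (100 residues), the number of conformations per residue (3), and the sampling rate (10^13 conformations per second) are assumed values commonly used for this kind of estimate, not figures taken from the text, and the function name is hypothetical.

```python
import math

SECONDS_PER_YEAR = 3.156e7  # approximately 365.25 days


def levinthal_search_time(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Return (number of conformations, seconds needed to visit each one once).

    All default parameter values are illustrative assumptions.
    """
    n_conformations = states_per_residue ** n_residues
    seconds = n_conformations / samples_per_second
    return n_conformations, seconds


n_conf, seconds = levinthal_search_time(100)
years = seconds / SECONDS_PER_YEAR
print(f"log10(conformations) = {math.log10(n_conf):.1f}")
print(f"log10(years)         = {math.log10(years):.1f}")
```

Under these assumptions an exhaustive search would take on the order of 10^27 years, which is why a purely random search is ruled out.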

The energy landscape picture, which derives from advances in both experiment and theory, replaces the pathway concept of sequential events with the funnel concept of parallel events and emphasizes the ensemble nature of protein conformations. In this view, folding does not follow a single pathway; instead, the protein is funneled to a single stable state along multiple routes through conformational space. The landscape picture has provided a powerful conceptual framework for rationalizing protein folding and unfolding and has promoted much of the recent progress in understanding the folding process.

Research on protein folding is closely tied to the development of techniques, and the techniques for studying folding have recently developed tremendously. Computer-aided molecular dynamics (MD) simulation has become an increasingly powerful tool for characterizing protein folding and unfolding. Graphics processing units (GPUs) are becoming increasingly useful for probing protein folding because of the computing power they offer. Simulation-based techniques now allow protein folding and unfolding to be described at the atomistic level; protein folding simulations are coming of age. Advances in experimental techniques have made it possible to observe detailed information about the successive conformations that occur during folding, and single-molecule manipulation techniques are becoming important experimental tools for studying folding and unfolding. The combined use of experimental data and computer simulations has been shown to be very efficient in the study of protein folding and unfolding. The common goal of theoreticians and experimentalists is to determine the folding process at atomic resolution by combining experimental and computational results.

It is evident that the study of protein folding has made significant progress in the past half-century. A large body of theoretical and experimental research indicates that the energy landscape funnel provides a sound framework for protein folding, and knowledge of protein folding has also promoted the understanding of human diseases. However, protein folding remains an open problem in areas such as folding in the cell, the folding of intrinsically disordered proteins (IDPs), and atomic-level folding events. Many researchers are working on these open problems, and Arieh Ben-Naim is one of them. In his continuing studies, especially his recent paper (Ben-Naim, 2012), he argued that switching from a target-based to a cause-based approach and adopting the hydrophilic paradigm leads straightforwardly to a solution of the protein folding problem. Ben-Naim supported this view with general logical speculation, and it runs contrary to the mainstream of current studies. His study may offer some new insight into the protein folding problem. However, it is doubtful that the protein folding problem will be solved along this route.

2 The effective method to study protein folding by combining experiment and computer simulation

The primary protein folding problem is how an amino acid sequence specifies both a native structure and the process by which that state is attained. Over the past several decades, the mechanisms of protein folding have been widely studied with both experimental techniques and computational methods, which are mutually complementary. On the one hand, many methods are used to extract more information from experimental data. On the other hand, these methods are complemented by theoretical calculations that have been able to reproduce experimental results and, in some cases, predict new ones. Our understanding of protein unfolding is being pushed forward by this symbiosis of experimental and theoretical methods.

With advances in experimental biophysical techniques, it has become possible to obtain detailed information about the native, transient, intermediate, and denatured states of proteins. To investigate how native states originate from the very first intermediate states of the folding process, several experimental techniques, such as hydrogen exchange, NMR, fluorescence transfer, and triplet excitation techniques, have been used to detect long-range interactions in transition states and intermediate states. Atomic force microscopy (AFM) applies a mechanical force to pull on a single protein molecule. With this technique, experiments are performed under non-equilibrium conditions, and a reaction coordinate is imposed on the unfolding process. With a mechanical force and a defined reaction coordinate, AFM can provide the free-energy landscape of the mechanical unfolding of proteins and unique information about intermediates. Protein engineering provides residue-specific information about the structures of intermediates and transition states of folding. The transition state is the only entity that can be studied in the folding of two-state proteins, and Φ-value analysis is, in general, the only experimental technique available for fine structural analysis of transition states. A method proposed by Jane Clarke’s group for quantitatively probing forced unfolding pathways extends Φ-value analysis to provide a high-resolution picture of the transition state for forced unfolding. Experimental studies of protein folding are hampered by the fact that only low-resolution structural data can be obtained with sufficient temporal resolution (Freddolino, Harrison, Liu, & Schulten, 2010). MD simulations are complementary to experimental techniques and provide atomic insight into folding and unfolding. High-precision simulations not only explain experimental data but also open new and interesting avenues of investigation.
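For reference, the quantity measured in Φ-value analysis is conventionally defined as a ratio of mutation-induced free-energy changes. The notation below is assumed here rather than taken from the text: ‡ denotes the transition state, N the native state, D the denatured state, and the superscripts wt and mut the wild-type and mutant proteins.

```latex
\Phi
  = \frac{\Delta\Delta G_{\ddagger\text{-}D}}{\Delta\Delta G_{N\text{-}D}}
  = \frac{\Delta G_{\ddagger\text{-}D}^{\mathrm{wt}} - \Delta G_{\ddagger\text{-}D}^{\mathrm{mut}}}
         {\Delta G_{N\text{-}D}^{\mathrm{wt}} - \Delta G_{N\text{-}D}^{\mathrm{mut}}}
```

A value of Φ close to 1 is usually read as the mutated site being as structured in the transition state as in the native state, and a value close to 0 as the site being as unstructured as in the denatured state; intermediate values are harder to interpret.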

Throughout their history, MD simulations have faced two mutually antagonistic challenges. On the one hand, the accuracy of modern MD simulations depends on the accuracy of the force field, yet current force fields describe the long-timescale structural dynamics of proteins poorly. Because simulating the folding process requires long MD runs, force field parameters need further refinement, or newly developed force fields must be used, for accurate folding simulations. On the other hand, protein folding is heterogeneous in nature, so a complete picture of the folding process requires as many trajectories as possible. MD trajectories have long correlation times, and even a single folding trajectory requires an immense computing effort. To obtain reasonable statistics from the trajectories, the simulations should also be as long as possible.

To address the timescale problem, a few approaches have been applied in recent folding simulations. One particularly powerful approach is to scale up general-purpose computing resources in tandem with improvements in the performance of MD programs, which provides continuously increasing simulation durations. Another approach is the use of special-purpose hardware. The Anton platform is a special-purpose supercomputer that carries out the various tasks required in an MD simulation with sets of application-specific integrated circuits.

The development of application programming interfaces targeted at general-purpose scientific computing has brought GPUs into general-purpose computing, where they are now accepted as serious tools for the economically efficient acceleration of an extensive range of scientific problems. GPUs achieve optimal performance by processing a large array of data in parallel with identical procedures, and MD simulations map well onto this architecture. The computational complexity and fine-grained parallelism of MD simulations of protein folding make them ideal candidates for implementation on GPUs.

Steered molecular dynamics (SMD) simulations have emerged as a flexible and powerful tool for providing information about the energy landscape that drives protein unfolding processes and about time-resolved complex formation. SMD mimics AFM experiments of forced unfolding by fixing one terminus and restraining the other terminus to a point in space that moves with constant velocity in a chosen direction. Quantitative estimates can additionally be obtained if non-equilibrium descriptions are employed in the analysis. The free energies of a proposed unfolding pathway have been described by an SMD approach carried out in a series of stages, which allows better convergence along nonlinear and long-distance pathways.
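The constant-velocity restraint and one possible non-equilibrium analysis can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the protocol of any particular study: the restraint is taken to be harmonic, the Jarzynski equality is used as one example of a non-equilibrium description (the text does not name a specific one), energies are assumed to be in kcal/mol, and the function names `spring_force` and `jarzynski_free_energy` are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def spring_force(k, v, t, displacement):
    """Force from a harmonic restraint whose anchor moves at constant velocity.

    k: spring constant; v: pulling velocity; t: elapsed time;
    displacement: progress of the pulled terminus along the pulling direction.
    """
    return k * (v * t - displacement)


def jarzynski_free_energy(work_values, temperature=300.0):
    """Estimate a free-energy difference from repeated pulling runs.

    Uses exp(-dF/kT) = <exp(-W/kT)>, evaluated with a log-sum-exp shift
    so that large work values do not underflow.
    """
    beta = 1.0 / (K_B * temperature)
    exponents = [-beta * w for w in work_values]
    shift = max(exponents)
    total = sum(math.exp(x - shift) for x in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -log_mean / beta


# Hypothetical work values (kcal/mol) from three independent pulling runs.
works = [4.0, 5.0, 7.0]
delta_f = jarzynski_free_energy(works)
print(f"estimated free-energy difference: {delta_f:.2f} kcal/mol")
```

The estimate is dominated by the trajectories with the smallest work, so in practice many pulling runs are needed before it converges; this is consistent with the remark above that staged protocols improve convergence.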

The concept of the free-energy landscape has promoted much of the recent progress in understanding the process of protein folding. Two essential principles of protein folding are summarized by this picture. The first is that folding is a stochastic process in which the free energy decreases spontaneously, and the second is that evolution has selected amino acid sequences that avoid misfolding, long-lived metastable traps, and aggregation. Of equal importance is the fact that landscapes provide a convenient framework for visualizing and interpreting experimental data on the thermodynamics and kinetics of proteins. The free-energy landscapes of protein folding and unfolding have been studied in many MD simulations, alone or in combination with experimental data. For example, in the property space, the unfolding trajectory ensemble of GB1 forms an unfolding “funnel model” of the protein (Wang, Zhao, Dou, & Zhang, 2008). The characteristics of the Box5 intermediate were described by a combination of NMR data and MD simulations (Zhao, Liu, Cao, Liu, & Wang, 2011).
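One common way to turn simulation data into a free-energy profile is to histogram an order parameter and apply F(x) = -kT ln P(x). The sketch below assumes the samples are drawn from an equilibrium ensemble and that a single one-dimensional coordinate is adequate; neither assumption comes from the studies cited above, and the function name, bin width, and sample values are hypothetical.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_profile(samples, bin_width, temperature=300.0):
    """Map bin centers to free energies relative to the most populated bin.

    samples: values of a one-dimensional order parameter (for example, a
    fraction of native contacts) collected along equilibrium trajectories.
    """
    kt = K_B * temperature
    counts = Counter(math.floor(x / bin_width) for x in samples)
    max_count = max(counts.values())
    return {
        (b + 0.5) * bin_width: -kt * math.log(c / max_count)
        for b, c in sorted(counts.items())
    }


# Hypothetical data: 90 samples in one basin and 10 in another.
samples = [0.1] * 90 + [1.1] * 10
profile = free_energy_profile(samples, bin_width=1.0)
for center, energy in profile.items():
    print(f"x = {center:.1f}  F = {energy:.2f} kcal/mol")
```

The most populated bin defines the zero of the profile, and a bin that is nine times less populated lies kT ln 9 higher. Unvisited bins are simply absent, which is a reminder that such a profile only describes the regions the trajectories actually sampled.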

In conclusion, the combined use of experimental data and computer simulations has been shown to be very effective in the study of protein folding and unfolding, which suggests that the funnel model is, to a degree, a correct framework for studying protein folding. Furthermore, the combined use of the latest experimental data and computer simulations has revealed detailed events of protein folding, which further indicates that the funnel model has helped advance the solution of the protein folding problem.

3 Target-based and cause-based protein folding

It is well known that most globular proteins have unique native states, which are important for their biological functions. Protein folding is the process by which a protein attains its stable native structure, which is determined by the protein sequence, the in vivo environment (such as ions, molecular chaperones, and pH), and evolution. That is to say, the protein “knows” its final conformation, and folding is target-based.

In the case of the drunken man proposed by Ben-Naim (2012), the drunken man cannot reach every point in the city because there are many buildings. Two other factors could make the drunken man more likely to reach point Y from the initial point A. One is subjective: the drunken man is familiar with this route, or something along it attracts him. The other is external: for example, the street from point A to point Y is broad, smooth, and bright, whereas the others are narrow, uneven, and dark. Because a big city has crossroads, there is more than one way from point A to point Y.

In the case of protein folding, it is true that a protein cannot “walk” through all of configurational space before finding the native state. A protein does not have the subjective preferences of a drunken man. Instead, the folding process is restrained by energy barriers (like the buildings in the city or the bad road conditions) that arise from the interactions among residues. The characteristics of the residue side chains, such as length, polarity, and the presence of a phenyl group, cause the protein to abandon some possible conformations and “guide” it to its native conformation. Different residues may have similar side-chain characteristics, which explains why proteins with different amino acid sequences may have similar 3D topological structures. Protein folding is thus a cause-guided process in which these interactions lead the protein to a preassigned target.

The widely accepted funnel theory implies that proteins fold from a high-energy state (the unfolded state) to the lowest energy state (the native state) as water drains through the funnel of the Gibbs energy landscape. Because the draining is not directed (target-guided), there are many local minima; the existence of intermediate states of proteins is good evidence for this. The folding target, the lowest energy state (native state), may not be the global minimum of the Gibbs energy landscape but rather the minimum that the protein can reach, as shown in Figure 5 of Ben-Naim (2012). Under abnormal conditions, a protein can visit the absolute minimum and remain stable there. Conformational diseases, such as bovine spongiform encephalopathy, provide strong evidence for this. The folding of those proteins is a combination of target-based and cause-guided folding.

4 Dominant forces for protein folding

Proteins carry out most of the important biological functions, which depend heavily on their folded structures. Many forces contribute to the diverse conformations of natural proteins, such as electrostatics (including classical charge repulsion and ion pairing), hydrogen bonds, van der Waals interactions, and hydration effects. Among these, hydrogen bonds and hydration effects are regarded as the two most important factors in protein folding and in stabilizing native 3D structures. The fundamental criterion for a dominant driving force is that it must explain why the folded state is favored relative to the unfolded state. The importance of the hydrophobic effect was first introduced in an early review by Kauzmann (Kauzmann, 1959). More than half a century later, a great deal of evidence supports the view that the hydrophobic effect is the dominant driving force of protein folding and stability. A comprehensive review by Dill (1990) showed that, in globular proteins, non-polar residues are clustered in a core to avoid contact with water molecules, and that hydrophobic interactions appear to be more strongly conserved and correlated with structure than other types of interactions. Furthermore, some computer simulations have also indicated that hydrophobic residues play important roles in protein folding. Even so, the molecular mechanism of hydrophobicity is still unclear, and many studies have indicated that other forces can play comparable roles in protein structure. Pace, Shirley, McNutt, and Gajiwala (1996) calculated the free-energy change of ribonuclease T1, and the results showed that hydrogen bonds make a larger contribution than the hydrophobic effect to the stability of this protein. Considering that hydrogen bonds are usually formed at the cost of burying many polar groups, the authors subtracted the cost of burying these groups; the net contribution of hydrogen bonds to stability is then only 0.6 kcal/mol, which implies that the hydrophobic effect makes a larger contribution than hydrogen bonds. The authors then inferred that hydrogen bonds and the hydrophobic effect make comparable contributions to the stability of globular proteins. Similar conclusions have been reached in other studies based on different methods and different proteins. Although the importance of the hydration effect in protein folding and stability has been fully recognized, the debate continues. In his recent work, Ben-Naim (2012) questioned the dominant role of the hydrophobic force and held that the hydrophilic effect plays the dominant role instead. He argued that the evidence adopted in previous works was not in fact conclusive. Regrettably, little practical information can be obtained from the paper, because the author offers no convincing examples. For a complex non-linear biological system, it is difficult to resolve such problems by physical equations alone, as is reflected in the continuing improvement of force fields for protein folding problems. In fact, the hydration effect, including both hydrophobic and hydrophilic contributions, has been emphasized in some models used to explore the mechanism of protein folding. Among these, the HP (hydrophobic-hydrophilic) model is the best known and has been applied extensively in research related to protein folding. In recent years, with the development of bioinformatics, physicochemical properties including hydrophobicity have also been broadly used in many prediction studies (Yu, Sun, & Wang, 2011). More objective discussion of the forces involved in protein folding is therefore necessary. In the future, the author may provide a more quantitative description of the hydrophilic effect.
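The HP model mentioned above reduces each residue to one of two types, H (hydrophobic) or P (hydrophilic/polar), and places the chain on a lattice. A minimal sketch of its usual energy function follows, assuming a two-dimensional square lattice and the common convention of -1 for every contact between H residues that are lattice neighbors but not chain neighbors. The function name and the example sequence are hypothetical.

```python
def hp_energy(sequence, coords):
    """Energy of an HP-model conformation on a 2D square lattice.

    sequence: string of 'H' and 'P'; coords: list of (x, y) lattice sites,
    one per residue, in chain order. Each non-bonded H-H contact scores -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# The same four-residue chain, folded into a square and fully extended.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
print(folded, extended)
```

In this toy case the compact conformation is favored only because it brings the two H residues together, which is the sense in which the model encodes hydrophobic collapse; it says nothing by itself about whether hydrophobic or hydrophilic effects dominate in real proteins.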

5 Folding of IDPs

IDPs are proteins that lack a fixed structure under physiological conditions yet perform many crucial biological functions. In the human genome, most IDPs are associated with human diseases such as cancers, diabetes, and neurodegenerative diseases. Because they lack “order-promoting” residues, these proteins are disordered or partially disordered in the unbound state. Many IDPs take part in one-to-many and many-to-one binding. A disorder-to-order transition occurs once they bind their partners, and some IDPs adopt different stable structures upon binding different partners. The transactivation domain (residues 1–73) of the tumor suppressor p53 is intrinsically disordered and has only some residual structure in the unbound state. In complex with replication protein A, the p53 fragment (residues 37–57) folds into two stable helices (residues Asp41–Met44 and Pro47–Thr55), whereas in complex with the nuclear coactivator binding domain of CBP, this p53 domain forms two helices spanning Phe19 to Leu25 and Pro47 to Trp53 (Lee, Martinez-Yamout, Dyson, & Wright, 2010). Such folding is also called ligand-induced folding. It is difficult to say whether the folding of IDPs is target-guided or cause-guided. Because IDPs have one-to-many binding modes, their folded stable structure is not unique, which means that the stable structure does not correspond to the absolute energy minimum.

The transactivation domain of p53 and the nuclear coactivator binding domain of CBP are both IDPs. NMR experiments indicated that their folding is mutually synergistic and that formation of their complex is driven largely by hydrophobic contacts that form a stable intermolecular hydrophobic core. On the other hand, IDPs are significantly depleted in the hydrophobic (Ile, Leu, and Val) and aromatic (Trp, Tyr, and Phe) amino acid residues that form and stabilize the hydrophobic cores of folded globular proteins. This implies that hydrophobic interactions play a key role in protein folding. IDP folding still poses many open problems that remain to be studied.

6 Concluding remarks

The development of theoretical and experimental techniques has advanced the study of protein folding over the past half-century, from the Anfinsen principle to the Levinthal paradox, from pathways to the landscape funnel, and from single domains to the discovery of chaperonins and the folding of IDPs. Combining experimental data with computer simulations is an efficient way to study protein folding. Some conclusions in the work of Ben-Naim may lead to alternative insights into the protein folding problem, but most are derived only from logical speculation and need to be verified by “wet” or “dry” experimental evidence.

Acknowledgements

We thank the Chinese Natural Science Foundation for financial support (Grant numbers 30970561, 31000324 and 61271378).

References

  • Ben-Naim, A. 2012. Levinthal’s question revisited, and answered. Journal of Biomolecular Structure & Dynamics, 30: 113–124.
  • Dill, K. A. 1990. Dominant forces in protein folding. Biochemistry, 29: 133–155.
  • Freddolino, P. L., Harrison, C. B., Liu, Y. and Schulten, K. 2010. Challenges in protein folding simulations: Timescale, representation, and analysis. Nature Physics, 6: 751–758.
  • Kauzmann, W. 1959. Some factors in the interpretation of protein denaturation. Advances in Protein Chemistry, 14: 1–63.
  • Lee, C. W., Martinez-Yamout, M. A., Dyson, H. J. and Wright, P. E. 2010. Structure of the p53 transactivation domain in complex with the nuclear receptor coactivator binding domain of CREB binding protein. Biochemistry, 49: 9964–9971.
  • Pace, C., Shirley, B., McNutt, M. and Gajiwala, K. 1996. Forces contributing to the conformational stability of proteins. FASEB Journal, 10: 75–83.
  • Wang, J. H., Zhao, L., Dou, X. and Zhang, Z. 2008. Study of multiple unfolding trajectories and unfolded states of the protein GB1 under the physical property space. Journal of Biomolecular Structure & Dynamics, 25: 609–619.
  • Yu, J. F., Sun, X. and Wang, J. H. 2011. A novel 2D graphical representation of protein sequence based on individual amino acid. International Journal of Quantum Chemistry, 111: 2835–2843.
  • Zhao, L., Liu, Z., Cao, Z., Liu, H. and Wang, J. H. 2011. Determination of thermal intermediate state ensemble of box 5 with restrained molecular dynamics simulations. Computational and Theoretical Chemistry, 978: 152–159.
