Applications and Case Studies

Collective Estimation of Multiple Bivariate Density Functions With Application to Angular-Sampling-Based Protein Loop Modeling

Pages 43-56 | Received 01 Nov 2012, Published online: 05 May 2016

