Evaluation of the structural quality of modeled proteins by using globularity criteria
Abstract
Knowledge of the three-dimensional structure of globular proteins is fundamental to a detailed investigation of their functional properties. Experimental methods are too slow for large-scale structure determination, while computational prediction methods offer alternatives that are continuously being improved. The international Critical Assessment of Structure Prediction (CASP), an "a posteriori" evaluation of the quality of theoretical models once the experimental structure becomes available, shows that predictions can fail as well as succeed. This suggests the need for evaluations able to discard wrong models "a priori".

We analyzed several structural properties of experimentally solved globular proteins belonging to four structural classes: "mainly alpha", "mainly beta", "alpha/beta" and "alpha+beta". The properties were found to be linearly correlated with protein molecular weight, with some differences among the four classes. We used these results to develop an evaluation test for theoretical models based on the expected globular properties of proteins. To verify the test, we applied it to several protein models submitted to the sixth edition of CASP. The best theoretical models, as judged by the CASP assessors, agreed with the expected properties, while most of the low-quality models did not pass our evaluations.

This study supports the need for careful checks to avoid the diffusion of incorrect structural models. Our test allows models to be evaluated in the absence of experimental reference structures, which helps to prevent the formulation of incorrect functional hypotheses. It can be used to check the globularity of predicted models and to supplement other methods already used to evaluate their quality.
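The idea of testing a model against a class-specific linear trend can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state which properties are measured, how the fit is made, or what tolerance is applied. The function names (`fit_line`, `residual_spread`, `passes_check`), the tolerance of `k` residual standard deviations, and the reference values are all assumptions made for the example, and the numbers are invented placeholders rather than measurements.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_spread(xs, ys, slope, intercept):
    """Standard deviation of the residuals around the fitted line."""
    ss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return (ss / (len(xs) - 2)) ** 0.5


def passes_check(mw, value, slope, intercept, spread, k=2.0):
    """True if value lies within k residual standard deviations of the trend.

    The k-sigma tolerance is an assumption of this sketch.
    """
    return abs(value - (slope * mw + intercept)) <= k * spread


# Placeholder reference set for ONE structural class:
# (molecular weight in Da, value of some structural property).
# Invented for illustration only; not taken from the study.
reference = [(10000, 52.0), (15000, 74.0), (20000, 101.0),
             (25000, 123.0), (30000, 151.0)]
xs = [mw for mw, _ in reference]
ys = [value for _, value in reference]

slope, intercept = fit_line(xs, ys)
spread = residual_spread(xs, ys, slope, intercept)

# A hypothetical model close to the trend, and one far from it.
print(passes_check(22000, 110.0, slope, intercept, spread))  # True
print(passes_check(22000, 60.0, slope, intercept, spread))   # False
```

In practice one such fit would be made per structural class and per property, since the abstract reports that the trends differ among the four classes.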