Evaluation of the structural quality of modeled proteins by using globularity criteria

Springer Science and Business Media LLC - Volume 7 - Pages 1-12 - 2007
Susan Costantini (1,2,3), Angelo M Facchiano (1,3), Giovanni Colonna (1,2)
(1) CRISCEB (Research Center of Computational and Biotechnological Sciences), Second University of Naples, Naples, Italy
(2) Department of Biochemistry and Biophysics, Second University of Naples, Naples, Italy
(3) Laboratory of Bioinformatics and Computational Biology, Institute of Food Science, CNR, Avellino, Italy

Abstract

Knowledge of the three-dimensional structure of globular proteins is fundamental to a detailed investigation of their functional properties. Experimental methods are too slow for structure determination on a large scale, while computational prediction methods offer alternatives that are continuously being improved. The international Critical Assessment of Structure Prediction (CASP), an "a posteriori" evaluation of the quality of theoretical models once the experimental structure becomes available, shows that predictions can succeed as well as fail, which suggests the need for evaluations able to discard wrong models "a priori". We analyzed different structural properties of experimentally solved globular proteins belonging to four structural classes: "mainly alpha", "mainly beta", "alpha/beta" and "alpha+beta". The properties were found to be linearly correlated with protein molecular weight, with some differences among the four classes. We used these results to develop an evaluation test for theoretical models based on the expected globular properties of proteins. To verify the test, we applied it to several protein models submitted to the sixth edition of CASP. The best theoretical models, as judged by the CASP assessors, agreed with the expected properties, while most of the low-quality models did not pass our evaluation. This study supports the need for careful checks to avoid the spread of incorrect structural models. Our test allows models to be evaluated in the absence of experimental reference structures, which helps prevent the formulation of incorrect functional hypotheses. It can be used to check the globularity of predicted models and to supplement other methods already used to evaluate their quality.
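The idea of checking a model against a property that scales linearly with molecular weight can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the properties, the fitted coefficients, or the acceptance thresholds, so everything below is an assumption made for illustration: the function names `fit_line` and `is_consistent`, the two-standard-deviation tolerance `n_sd`, and the synthetic reference points in `TOY_REFERENCE`, which are invented numbers and not data from the study.

```python
import math

def fit_line(points):
    """Ordinary least-squares fit of value = a * mw + b.

    points: list of (molecular_weight, property_value) pairs.
    Returns (a, b, sd), where sd is the residual standard deviation
    computed with n - 2 degrees of freedom.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three reference points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in points)
    return a, b, math.sqrt(ss_res / (n - 2))

def is_consistent(mw, value, fit, n_sd=2.0):
    """True if value lies within n_sd residual SDs of the fitted line.

    The tolerance n_sd is a hypothetical choice, not taken from the paper.
    """
    a, b, sd = fit
    return abs(value - (a * mw + b)) <= n_sd * sd

# Synthetic (molecular weight in Da, property value) pairs for one
# structural class. Invented for illustration only.
TOY_REFERENCE = {
    "mainly alpha": [
        (10000, 5200.0),
        (20000, 9900.0),
        (30000, 15100.0),
        (40000, 19800.0),
    ],
}

# One fit per structural class, since the abstract reports class differences.
FITS = {cls: fit_line(pts) for cls, pts in TOY_REFERENCE.items()}
```

A model of a "mainly alpha" protein would then be checked with `is_consistent(model_mw, model_value, FITS["mainly alpha"])`, repeating the check for each property of interest. Fitting a separate line per class follows the abstract's observation that the linear relationships differ among the four structural classes.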
