A knowledge-based structure-discriminating function that requires only main-chain atom coordinates
Abstract
Knowledge-based potential functions are a powerful tool for protein structure evaluation. A variety of formulations that evaluate single or multiple structural features of proteins have been developed and studied. The performance of such functions is often assessed by their discrimination ability on decoy structures of target proteins. A function that can evaluate coarse-grained structures is advantageous in many respects, such as the relatively easy generation and manipulation of model structures; however, reducing the structural representation often degrades structure discrimination performance. We developed a knowledge-based pseudo-energy function for protein structure discrimination. The function (Discriminating Function using Main-chain Atom Coordinates, DFMAC) consists of six pseudo-energy components that deal with different structural features. Only the main-chain N, Cα, and C atom coordinates of each amino acid residue are required as input for structure evaluation. The 231 target structures in 12 different types of decoy sets were divided into 154 and 77 targets, which were used for function training and the subsequent performance test, respectively. In the test set, 59 (76.6%) native and 68 (88.3%) near-native (< 2.0 Å Cα RMSD) targets were successfully identified. The average Cα RMSD for the test set was 1.174 with the tuned parameters. The orientation-dependent component accounted for the major part of the discrimination performance. Despite the reduced representation of input structures, DFMAC showed considerable structure discrimination ability. The function can be applied to the identification of near-native structures in structure prediction experiments.
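The test-set figures quoted above (native hits, near-native hits, and average Cα RMSD) can be illustrated with a minimal sketch. It assumes, as is common in decoy discrimination tests but not stated explicitly here, that the lowest pseudo-energy structure is selected for each target, that a native structure has a Cα RMSD of 0.0, and that the near-native count includes native selections. The `Structure` class, the `pick_best` and `summarize` helpers, and the toy energies are hypothetical and are not part of DFMAC.

```python
from dataclasses import dataclass

NEAR_NATIVE_CUTOFF = 2.0  # C-alpha RMSD threshold (in angstroms) quoted in the abstract


@dataclass
class Structure:
    name: str
    energy: float   # pseudo-energy from some scoring function (lower is better: an assumption)
    ca_rmsd: float  # C-alpha RMSD to the native structure; 0.0 for the native itself


def pick_best(structures):
    """Return the structure with the lowest pseudo-energy for one target."""
    return min(structures, key=lambda s: s.energy)


def summarize(targets):
    """Summarize selections over a dict of target id -> list of Structure."""
    picks = [pick_best(structs) for structs in targets.values()]
    n = len(picks)
    return {
        "targets": n,
        "native": sum(1 for p in picks if p.ca_rmsd == 0.0),
        "near_native": sum(1 for p in picks if p.ca_rmsd < NEAR_NATIVE_CUTOFF),
        "mean_ca_rmsd": sum(p.ca_rmsd for p in picks) / n,
    }


# Toy data (invented for illustration only): one native plus one decoy per target.
toy = {
    "t1": [Structure("native", -10.0, 0.0), Structure("decoy", -8.0, 3.1)],
    "t2": [Structure("native", -5.0, 0.0), Structure("decoy", -6.0, 1.5)],
    "t3": [Structure("native", -4.0, 0.0), Structure("decoy", -7.0, 4.0)],
}
result = summarize(toy)

# Arithmetic behind the reported split and success rates.
train_targets, test_targets = 154, 77
native_pct = round(100 * 59 / test_targets, 1)
near_native_pct = round(100 * 68 / test_targets, 1)
```

With the toy data, the native is selected for one target, a near-native decoy for a second, and a distant decoy for the third. The last lines reproduce the reported percentages from the counts of 59 and 68 out of 77 test targets.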