Statistical potential for assessment and prediction of protein structures

Protein Science - Volume 15, Issue 11 - Pages 2507-2524 - 2006
Min‐Yi Shen¹, Andrej Šali²
¹Department of Biopharmaceutical Sciences, Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, California 94158, USA
²Department of Biopharmaceutical Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California 94158, USA

Abstract

Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using probability theory, we derive an atomic distance‐dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions on three criteria: the detection of the native state in six multiple‐target decoy sets, the correlation between the score and model error, and the identification of the most accurate non‐native structure in the decoy set. For all decoy sets, DOPE is the best‐performing function on all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo‐electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER‐8.
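The idea of a distance‐dependent statistical potential with a spherical reference state can be illustrated with a minimal sketch. It is not the DOPE implementation: the function names, the single fixed radius `a`, the binning, and the normalization over the binned range are all assumptions made for illustration. The only ingredient taken as given is the standard closed‐form density for the distance between two points placed uniformly and independently in a sphere of radius a, p(r) = (3r²/a³)(1 − 3r/(4a) + r³/(16a³)) for 0 ≤ r ≤ 2a. The energy in each distance bin is then −kT ln(p_obs/p_ref).

```python
import math


def sphere_pair_density(r, a):
    """Density of the distance r between two points drawn uniformly and
    independently inside a sphere of radius a (zero outside 0 <= r <= 2a)."""
    if r < 0.0 or r > 2.0 * a:
        return 0.0
    x = r / a
    return (3.0 * x ** 2 / a) * (1.0 - 0.75 * x + x ** 3 / 16.0)


def statistical_potential(counts, bin_width, a, kT=1.0):
    """Hypothetical helper: convert observed pair-distance counts into
    per-bin energies -kT * ln(p_obs / p_ref).

    counts[i] is the number of atom pairs observed with distance in
    [i * bin_width, (i + 1) * bin_width). Both distributions are
    normalized over the binned range (an assumption of this sketch).
    Bins with no observations or zero reference probability get None.
    """
    # Reference probability per bin, approximated with the midpoint rule.
    ref = [sphere_pair_density((i + 0.5) * bin_width, a) * bin_width
           for i in range(len(counts))]
    ref_total = sum(ref)
    obs_total = sum(counts)

    energies = []
    for n, p in zip(counts, ref):
        if n <= 0 or p <= 0.0:
            energies.append(None)
            continue
        p_obs = n / obs_total
        p_ref = p / ref_total
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies
```

In this sketch, a bin where pairs are observed more often than the noninteracting reference predicts receives a negative (favorable) energy, and a histogram that exactly follows the reference yields zero energy everywhere.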
