Modeling of loops in protein structures

Protein Science - Volume 9, Issue 9 - Pages 1753-1773 - 2000
András Fiser¹, Richard Kinh Gian Do¹, Andrej Šali¹
¹Laboratories of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, 1230 York Ave., New York, New York 10021

Abstract

Comparative protein structure prediction is limited mostly by the errors in alignment and loop modeling. We describe here a new automated modeling technique that significantly improves the accuracy of loop predictions in protein structures. The positions of all nonhydrogen atoms of the loop are optimized in a fixed environment with respect to a pseudo energy function. The energy is a sum of many spatial restraints that include the bond length, bond angle, and improper dihedral angle terms from the CHARMM-22 force field, statistical preferences for the main-chain and side-chain dihedral angles, and statistical preferences for nonbonded atomic contacts that depend on the two atom types, their distance through space, and separation in sequence. The energy function is optimized with the method of conjugate gradients combined with molecular dynamics and simulated annealing. Typically, the predicted loop conformation corresponds to the lowest energy conformation among 500 independent optimizations. Predictions were made for 40 loops of known structure at each length from 1 to 14 residues. The accuracy of loop predictions is evaluated as a function of thoroughness of conformational sampling, loop length, and structural properties of native loops. When accuracy is measured by local superposition of the model on the native loop, 100, 90, and 30% of 4-, 8-, and 12-residue loop predictions, respectively, had <2 Å RMSD error for the main-chain N, Cα, C, and O atoms; the average accuracies were 0.59 ± 0.05, 1.16 ± 0.10, and 2.61 ± 0.16 Å, respectively. To simulate real comparative modeling problems, the method was also evaluated by predicting loops of known structure in only approximately correct environments with errors typical of comparative modeling without misalignment. When the RMSD distortion of the main-chain stem atoms is 2.5 Å, the average loop prediction error increased by 180, 25, and 3% for 4-, 8-, and 12-residue loops, respectively.
The accuracy of the lowest energy prediction for a given loop can be estimated from the structural variability among a number of low energy predictions. The relative value of the present method is gauged by (1) comparing it with one of the most successful previously described methods, and (2) describing its accuracy in recent blind predictions of protein structure. Finally, it is shown that the average accuracy of prediction is limited primarily by the accuracy of the energy function rather than by the extent of conformational sampling.
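
The sampling protocol described above (many independent optimizations, keep the lowest-energy result, then judge confidence from the spread among other low-energy results) can be illustrated with a small toy sketch. This is not the authors' implementation: the energy function `toy_energy`, the dihedral-only representation, the parameter values, and all function names are hypothetical, and a plain Metropolis simulated-annealing loop stands in for the conjugate gradients, molecular dynamics, and simulated annealing combination named in the abstract.

```python
import math
import random

def toy_energy(angles, targets):
    """Hypothetical pseudo-energy: a sum of per-angle restraint terms.
    It is NOT the paper's energy function, only a rugged stand-in."""
    return sum(1.0 - math.cos(a - t) + 0.3 * (1.0 - math.cos(3.0 * (a - t)))
               for a, t in zip(angles, targets))

def anneal_once(targets, rng, steps=2000, t_start=2.0, t_end=0.01):
    """One independent optimization from a random start (Metropolis annealing)."""
    angles = [rng.uniform(-math.pi, math.pi) for _ in targets]
    energy = toy_energy(angles, targets)
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        trial = list(angles)
        trial[rng.randrange(len(trial))] += rng.gauss(0.0, 0.3)
        e_trial = toy_energy(trial, targets)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_trial <= energy or rng.random() < math.exp((energy - e_trial) / temp):
            angles, energy = trial, e_trial
    return energy, angles

def predict(targets, n_runs=50, seed=0):
    """Run n_runs independent optimizations; return them sorted by energy (lowest first)."""
    rng = random.Random(seed)
    return sorted((anneal_once(targets, rng) for _ in range(n_runs)),
                  key=lambda run: run[0])

def angular_rms(a, b):
    """RMS difference between two angle lists, wrapping each difference into [-pi, pi]."""
    diffs = [math.atan2(math.sin(x - y), math.cos(x - y)) for x, y in zip(a, b)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Usage: 20 runs here for speed; the abstract reports typically using 500.
targets = [-1.0, 2.0, 0.5, -2.5]            # hypothetical minimum-energy dihedrals
runs = predict(targets, n_runs=20, seed=1)
best_energy, best_angles = runs[0]          # the "prediction" is the lowest-energy run
# Spread among the next few low-energy runs: a rough confidence indicator.
spread = max(angular_rms(best_angles, angles) for _, angles in runs[1:5])
print(f"best energy = {best_energy:.3f}, spread among low-energy runs = {spread:.3f} rad")
```

A small spread means the independent runs converged on similar conformations, which is the kind of agreement the abstract says can be used to estimate the accuracy of the lowest-energy prediction. A real loop model would compare Cartesian coordinates by RMSD after superposition rather than comparing dihedral angles directly.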
