Exploiting sequence and structure homologs to identify protein–protein binding sites

Proteins: Structure, Function and Bioinformatics - Volume 62, Issue 3 - Pages 630-640 - 2006
Jo‐Lan Chung1,2, Wei Wang1, Philip E. Bourne3,2
1Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California
2San Diego Supercomputer Center, University of California, San Diego, La Jolla, California
3Department of Pharmacology, University of California, San Diego, La Jolla, California

Abstract

A rapid increase in the number of experimentally derived three-dimensional structures provides an opportunity to better understand, and subsequently predict, protein–protein interactions. In this study, structurally conserved residues were derived from multiple structure alignments of the individual components of known complexes. The assigned conservation score was weighted by the crystallographic B factor to account for the structural flexibility that results in a poor alignment. Sequence profile and accessible surface area information were then combined with the conservation score to predict protein–protein binding sites using a Support Vector Machine (SVM). Incorporating the conservation score significantly improved the performance of the SVM. About 52% of the binding sites were precisely predicted (more than 70% of the residues in the site were identified); 77% of the binding sites were correctly predicted (more than 50% of the residues in the site were identified); and 21% of the binding sites were partially covered by the predicted residues (some residues were identified). The results support the hypothesis that in many cases protein interfaces require some residues to provide rigidity, minimizing the entropic cost of complex formation. Proteins 2006. © 2005 Wiley-Liss, Inc.
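The coverage thresholds quoted in the abstract can be illustrated with a minimal sketch. The function name `classify_site`, the label strings, and the use of residue indices are assumptions made here for illustration; the abstract gives only the thresholds, not an implementation. The sketch returns the strictest label that applies, so a site labeled "precise" would also count toward the "correctly predicted" figure.

```python
def classify_site(true_site, predicted):
    """Label one binding site by the fraction of its residues that were predicted.

    true_site: residue identifiers that make up the known binding site.
    predicted: residue identifiers flagged as interface residues.
    Thresholds follow the abstract: >70% precise, >50% correct, >0% partial.
    The name, labels, and inputs are hypothetical, not taken from the paper.
    """
    true_site = set(true_site)
    if not true_site:
        raise ValueError("binding site must contain at least one residue")

    # Fraction of the known site recovered by the prediction.
    coverage = len(true_site & set(predicted)) / len(true_site)

    if coverage > 0.7:
        label = "precise"
    elif coverage > 0.5:
        label = "correct"
    elif coverage > 0:
        label = "partial"
    else:
        label = "missed"
    return coverage, label


if __name__ == "__main__":
    site = range(10)  # a hypothetical ten-residue binding site
    print(classify_site(site, range(8)))  # eight of ten residues recovered
    print(classify_site(site, [0, 42]))   # one of ten residues recovered
```

Residues predicted outside the known site do not affect the label in this sketch, because the abstract defines the categories only by how much of the site is recovered.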
