Using Support Vector Machine Combined with Post-processing Procedure to Improve Prediction of Interface Residues in Transient Complexes

The Protein Journal - Volume 28 - Pages 369-374 - 2009
Rong Liu1, Yanhong Zhou1
1Hubei Bioinformatics and Molecular Imaging Key Laboratory, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China

Abstract

Reliable prediction of interface residues in transient complexes remains challenging, yet it is highly desirable for the design of new drugs. Existing computational methods rely mainly on evolutionary information to identify these key residues, but evolutionary information may not be effective for every type of transient complex; antigen–antibody complexes are one example. Here we combined B-factor with sequence profile and accessible surface area to predict these residues using a support vector machine (SVM). Furthermore, a post-processing method was developed to reduce the number of false positives produced by the SVM. The prediction results show that B-factor is an effective indicator of interface residues in antigen–antibody complexes as well as in other types of transient complexes. In addition, we found that the post-processing procedure made an important contribution to further improving the prediction performance. Consequently, the proposed approach could provide new insight into the accurate prediction of interface residues in different types of transient complexes.
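The abstract names the ingredients (B-factor, sequence profile, accessible surface area, an SVM, and a post-processing step) but not their details. The following is a minimal sketch of how such a pipeline's inputs and outputs might be handled, under stated assumptions: the sliding-window encoding, the zero padding at chain ends, and the neighbor-support filter are all hypothetical illustrations, not the authors' procedure, and the function names `window_features` and `filter_isolated_positives` are invented here. The SVM itself is omitted; the vectors built below would be passed to any SVM library.

```python
from typing import List, Sequence


def window_features(per_residue: Sequence[Sequence[float]],
                    center: int, half_width: int) -> List[float]:
    """Concatenate per-residue feature rows over a window centred on `center`.

    Each row is assumed (hypothetically) to hold a normalised B-factor, a
    relative accessible surface area, and sequence-profile scores.
    Positions that fall off either end of the chain are zero-padded.
    """
    width = len(per_residue[0])
    features: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(per_residue):
            features.extend(per_residue[pos])
        else:
            features.extend([0.0] * width)
    return features


def filter_isolated_positives(predicted: Sequence[bool],
                              neighbors: Sequence[Sequence[int]],
                              min_support: int = 1) -> List[bool]:
    """Assumed post-processing rule, not the paper's: keep a predicted
    interface residue only if at least `min_support` of its spatial
    neighbours were also predicted as interface residues."""
    cleaned: List[bool] = []
    for i, is_positive in enumerate(predicted):
        support = sum(1 for j in neighbors[i] if predicted[j])
        cleaned.append(bool(is_positive and support >= min_support))
    return cleaned


# Toy data: 3 residues, 4 features each (B-factor, ASA, 2 profile columns).
rows = [[0.1, 0.5, 1.0, -1.0],
        [0.3, 0.2, 0.0, 2.0],
        [0.9, 0.8, -2.0, 1.0]]
vec = window_features(rows, center=0, half_width=1)  # 3 positions x 4 features

# Toy SVM output for 4 residues, with a hypothetical neighbour list per residue.
predicted = [True, True, False, True]
neighbors = [[1], [0, 2], [1, 3], [2]]
cleaned = filter_isolated_positives(predicted, neighbors)
print(cleaned)  # [True, True, False, False]
```

In this toy run the last residue is dropped because its only neighbor was predicted negative, which illustrates how a spatial-consistency rule can remove isolated false positives at the cost of possibly discarding true but sparsely surrounded interface residues.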
