A new ensemble coevolution system for detecting HIV-1 protein coevolution
Abstract
Keywords
References
Zhao G, Perilla JR, Yufenyuy EL, Meng X, Chen B, Ning J, Ahn J, Gronenborn AM, Schulten K, Aiken C, Zhang P (2013) Mature HIV-1 capsid structure by cryo-electron microscopy and all-atom molecular dynamics. Nature 497:643–646
Waheed AA, Freed EO (2012) HIV type 1 Gag as a target for antiviral therapy. AIDS Res Hum Retroviruses 28:54–75
Bell NM, Lever AM (2013) HIV Gag polyprotein: processing and early viral particle assembly. Trends Microbiol 21:136–144
Fun A, Wensing AM, Verheyen J, Nijhuis M (2012) Human immunodeficiency virus Gag and protease: partners in resistance. Retrovirology 9:63
Carlson JM, Brumme ZL, Rousseau CM, Brumme CJ, Matthews P, Kadie C, Mullins JI, Walker BD, Harrigan PR, Goulder PJ, Heckerman D (2008) Phylogenetic dependency networks: inferring patterns of CTL escape and codon covariation in HIV-1 Gag. PLoS Comput Biol 4:e1000225
Kalinina OV, Oberwinkler H, Glass B, Krausslich HG, Russell RB, Briggs JA (2012) Computational identification of novel amino-acid interactions in HIV Gag via correlated evolution. PLoS One 7:e42468
Dahirel V, Shekhar K, Pereyra F, Miura T, Artyomov M, Talsania S, Allen TM, Altfeld M, Carrington M, Irvine DJ, Walker BD, Chakraborty AK (2011) Coordinate linkage of HIV evolution reveals regions of immunological vulnerability. Proc Natl Acad Sci U S A 108:11530–11535
Rhee SY, Liu TF, Holmes SP, Shafer RW (2007) HIV-1 subtype B protease and reverse transcriptase amino acid covariation. PLoS Comput Biol 3:e87
Rhee SY, Liu TF, Kiuchi M, Zioni R, Gifford RJ, Holmes SP, Shafer RW (2008) Natural variation of HIV-1 group M integrase: implications for a new class of antiretroviral inhibitors. Retrovirology 5:74
Beerenwinkel N, Rahnenfuhrer J, Daumer M, Hoffmann D, Kaiser R, Selbig J, Lengauer T (2005) Learning multiple evolutionary pathways from cross-sectional data. J Comput Biol 12:584–598
Travers SA, Tully DC, McCormack GP, Fares MA (2007) A study of the coevolutionary patterns operating within the env gene of the HIV-1 group M subtypes. Mol Biol Evol 24:2787–2801
Bizinoto MC, Yabe S, Leal E, Kishino H, Martins Lde O, de Lima ML, Morais ER, Diaz RS, Janini LM (2013) Codon pairs of the HIV-1 vif gene correlate with CD4+ T cell count. BMC Infect Dis 13:173
Theys K, Deforche K, Libin P, Camacho RJ, Van Laethem K, Vandamme AM (2010) Resistance pathways of human immunodeficiency virus type 1 against the combination of zidovudine and lamivudine. J Gen Virol 91:1898–1908
Fares MA, Travers SA (2006) A novel method for detecting intramolecular coevolution: adding a further dimension to selective constraints analyses. Genetics 173:9–23
Lovell SC, Robertson DL (2010) An integrated view of molecular coevolution in protein-protein interactions. Mol Biol Evol 27:2567–2575
Lockless SW, Ranganathan R (1999) Evolutionarily conserved pathways of energetic connectivity in protein families. Science 286:295–299
Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, Marks DS (2012) Three-dimensional structures of membrane proteins from genomic sequencing. Cell 149:1607–1621
Ashkenazy H, Kliger Y (2010) Reducing phylogenetic bias in correlated mutation analysis. Protein Eng Des Sel 23:321–326
Weigt M, White RA, Szurmant H, Hoch JA, Hwa T (2009) Identification of direct residue contacts in protein-protein interaction by message passing. Proc Natl Acad Sci U S A 106:67–72
Suel GM, Lockless SW, Wall MA, Ranganathan R (2003) Evolutionarily conserved networks of residues mediate allosteric communication in proteins. Nat Struct Biol 10:59–69
Rausell A, Juan D, Pazos F, Valencia A (2010) Protein interactions and ligand binding: from protein subfamilies to functional specificity. Proc Natl Acad Sci U S A 107:1995–2000
de Juan D, Pazos F, Valencia A (2013) Emerging methods in protein co-evolution. Nat Rev Genet 14:249–261
Fitch WM, Markowitz E (1970) An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem Genet 4:579–593
Horner DS, Pirovano W, Pesole G (2008) Correlated substitution analysis and the prediction of amino acid structural contacts. Brief Bioinform 9:46–56
Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M (2011) Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci U S A 108:E1293–E1301
Ekeberg M, Lovkvist C, Lan Y, Weigt M, Aurell E (2013) Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models. Phys Rev E Stat Nonlin Soft Matter Phys 87:012707
Liu Y, Bahar I (2012) Sequence evolution correlates with structural dynamics. Mol Biol Evol 29:2253–2263
Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33:1–39
Breiman L (2001) Random forests. Mach Learn 45:5–32
Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: ICML 1996, pp 148–156
Troć M, Unold O (2010) Self-adaptation of parameters in a learning classifier system ensemble machine
Gao Y, Huang JZ, Wu L (2007) Learning classifier system ensemble and compact rule set. Connect Sci 19:321–337
Bacardit J, Krasnogor N (2008) Empirical evaluation of ensemble techniques for a Pittsburgh learning classifier system. In: Learning Classifier Systems. Springer, Berlin Heidelberg, 4998:255–268
Dunn SD, Wahl LM, Gloor GB (2008) Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics 24:333–340
Deforche K, Silander T, Camacho R, Grossman Z, Soares MA, Van Laethem K, Kantor R, Moreau Y, Vandamme AM, non-B Workgroup (2006) Analysis of HIV-1 pol sequences using Bayesian Networks: implications for drug resistance. Bioinformatics 22:2975–2979
Yeang CH, Haussler D (2007) Detecting coevolution in and among protein domains. PLoS Comput Biol 3:e211
Dutheil J, Galtier N (2007) Detecting groups of coevolving positions in a molecule: a clustering approach. BMC Evol Biol 7:242
Halperin I, Wolfson H, Nussinov R (2006) Correlated mutations: advances and limitations. A study on fusion proteins and on the Cohesin-Dockerin families. Proteins 63:832–845
Di Lena P, Nagata K, Baldi P (2012) Deep architectures for protein contact map prediction. Bioinformatics 28:2449–2457
Eickholt J, Cheng J (2012) Predicting protein residue-residue contacts using deep networks and boosting. Bioinformatics 28:3066–3072
Kamisetty H, Ovchinnikov S, Baker D (2013) Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era. Proc Natl Acad Sci U S A 110:15674–15679
Tillier ER, Lui TW (2003) Using multiple interdependency to separate functional from phylogenetic correlations in protein alignments. Bioinformatics 19:750–755
Burger L, van Nimwegen E (2010) Disentangling direct from indirect co-evolution of residues in protein alignments. PLoS Comput Biol 6:e1000633
Ackerman SH, Tillier ER, Gatti DL (2012) Accurate simulation and detection of coevolution signals in multiple sequence alignments. PLoS One 7:e47108
Bremm S, Schreck T, Boba P, Held S, Hamacher K (2010) Computing and visually analyzing mutual information in molecular co-evolution. BMC Bioinform 11:330
Gao H, Dou Y, Yang J, Wang J (2011) New methods to measure residues coevolution in proteins. BMC Bioinform 12:206
Lee BC, Kim D (2009) A new method for revealing correlated mutations under the structural and functional constraints in proteins. Bioinformatics 25:2506–2513
Tegge AN, Wang Z, Eickholt J, Cheng J (2009) NNcon: improved protein contact map prediction using 2D-recursive neural networks. Nucleic Acids Res 37:W515–W518
Jones DT, Buchan DW, Cozzetto D, Pontil M (2012) PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28:184–190
Wang Z, Xu J (2013) Predicting protein contact map using evolutionary and physical constraints by integer programming. Bioinformatics 29:i266–i273
Gouveia-Oliveira R, Pedersen AG (2007) Finding coevolving amino acid residues using row and column weighting of mutual information and multi-dimensional amino acid representation. Algorithms Mol Biol 2:12
Poon AF, Lewis FI, Frost SD, Kosakovsky Pond SL (2008) Spidermonkey: rapid detection of co-evolving sites using Bayesian graphical models. Bioinformatics 24:1949–1950
Halabi N, Rivoire O, Leibler S, Ranganathan R (2009) Protein sectors: evolutionary units of three-dimensional structure. Cell 138:774–786
Cheng J, Baldi P (2007) Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinform 8:113
Little DY, Chen L (2009) Identification of coevolving residues and coevolution potentials emphasizing structure, bond formation and catalytic coordination in protein evolution. PLoS One 4:e4762
Gouy M, Guindon S, Gascuel O (2010) SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol 27:221–224
Li G, Verheyen J, Rhee SY, Voet A, Vandamme AM, Theys K (2013) Functional conservation of HIV-1 gag: implications for rational drug design. Retrovirology 10:126
Minh BQ, Le Vinh S, Von Haeseler A, Schmidt HA (2005) pIQPNNI: parallel reconstruction of large maximum likelihood phylogenies. Bioinformatics 21:3794–3796
Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690
Hooft RW, Vriend G, Sander C, Abola EE (1996) Errors in protein structures. Nature 381:272
Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010) The binormal assumption on precision-recall curves. In: 2010 20th International Conference on Pattern Recognition (ICPR). IEEE, pp 4263–4266
Li Y, Fang Y, Fang J (2011) Predicting residue-residue contacts using random forest models. Bioinformatics 27:3379–3384
Wolda H (1981) Similarity indices, sample size and diversity. Oecologia 50:296–302
Polikar R (2012) Ensemble learning. In: Ensemble Machine Learning. Springer, pp 1–34
Krogh A, Sollich P (1997) Statistical mechanics of ensemble learning. Phys Rev E 55:811
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
Sinisi SE, Polley EC, Petersen ML, Rhee SY, Van Der Laan MJ (2007) Super learning: an application to the prediction of HIV-1 drug resistance. Stat Appl Genet Mol Biol 6:Article7
Gama J, Brazdil P (2000) Cascade generalization. Mach Learn 41:315–343
Saha I, Zubek J, Klingstrom T, Forsberg S, Wikander J, Kierczak M, Maulik U, Plewczynski D (2014) Ensemble learning prediction of protein-protein interactions using proteins functional annotations. Mol BioSyst
Yang J, Jang R, Zhang Y, Shen HB (2013) High-accuracy prediction of transmembrane inter-helix contacts and application to GPCR 3D structure modeling. Bioinformatics 29:2579–2587
Skwark MJ, Abdel-Rehim A, Elofsson A (2013) PconsC: combination of direct information methods and alignments improves contact prediction. Bioinformatics 29:1815–1816
Dutheil JY (2012) Detecting coevolving positions in a molecule: why and how to account for phylogeny. Brief Bioinform 13:228–243
Hakes L, Lovell SC, Oliver SG, Robertson DL (2007) Specificity in protein interactions and its relationship with sequence diversity and coevolution. Proc Natl Acad Sci U S A 104:7999–8004
Ha JH, Loh SN (2012) Protein conformational switches: from nature to design. Chemistry 18:7984–7999
Fodor AA, Aldrich RW (2004) Influence of conservation on calculations of amino acid covariance in multiple sequence alignments. Proteins 56:211–221
Morikawa Y, Zhang WH, Hockley DJ, Nermut MV, Jones IM (1998) Detection of a trimeric human immunodeficiency virus type 1 Gag intermediate is dependent on sequences in the matrix protein, p17. J Virol 72:7659–7663
Kiernan RE, Ono A, Freed EO (1999) Reversion of a human immunodeficiency virus type 1 matrix mutation affecting Gag membrane binding, endogenous reverse transcriptase activity, and virus infectivity. J Virol 73:4728–4737
Tedbury PR, Ablan SD, Freed EO (2013) Global rescue of defects in HIV-1 envelope glycoprotein incorporation: implications for matrix structure. PLoS Pathog 9:e1003739
Pornillos O, Ganser-Pornillos BK, Yeager M (2011) Atomic-level modelling of the HIV capsid. Nature 469:424–427
Pornillos O, Ganser-Pornillos BK, Kelly BN, Hua Y, Whitby FG, Stout CD, Sundquist WI, Hill CP, Yeager M (2009) X-ray structures of the hexameric building block of the HIV capsid. Cell 137:1282–1292
Byeon IJ, Meng X, Jung J, Zhao G, Yang R, Ahn J, Shi J, Concel J, Aiken C, Zhang P, Gronenborn AM (2009) Structural convergence between Cryo-EM and NMR reveals intersubunit interactions critical for HIV-1 capsid function. Cell 139:780–790
Yufenyuy EL, Aiken C (2013) The NTD-CTD intersubunit interface plays a critical role in assembly and stabilization of the HIV-1 capsid. Retrovirology 10:29
Liang C, Hu J, Russell RS, Roldan A, Kleiman L, Wainberg MA (2002) Characterization of a putative α-helix across the capsid-SP1 boundary that is critical for the multimerization of human immunodeficiency virus type 1 Gag. J Virol 76:11729–11737
Liu Y, Eyal E, Bahar I (2008) Analysis of correlated mutations in HIV-1 protease using spectral clustering. Bioinformatics 24:1243–1250
Haq O, Levy RM, Morozov AV, Andrec M (2009) Pairwise and higher-order correlations among drug-resistance mutations in HIV-1 subtype B protease. BMC Bioinform 10(Suppl 8):S10
Li G, Verheyen J, Theys K, Piampongsant S, Van Laethem K, Vandamme AM (2014) HIV-1 Gag C-terminal amino acid substitutions emerging under selective pressure of protease inhibitors in patient populations infected with different HIV-1 subtypes. Retrovirology 11:79
Prabu-Jeyabalan M, Nalivaika E, Schiffer CA (2002) Substrate shape determines specificity of recognition for HIV-1 protease: analysis of crystal structures of six substrate complexes. Structure 10:369–381
Lee SK, Potempa M, Kolli M, Ozen A, Schiffer CA, Swanstrom R (2012) Context surrounding processing sites is crucial in determining cleavage rate of a subset of processing sites in HIV-1 Gag and Gag-Pro-Pol polyprotein precursors by viral protease. J Biol Chem 287:13279–13290
Vercauteren J, Beheydt G, Prosperi M, Libin P, Imbrechts S, Camacho R, Clotet B, De Luca A, Grossman Z, Kaiser R, Sonnerborg A, Torti C, Van Wijngaerden E, Schmit JC, Zazzi M, Geretti AM, Vandamme AM, Van Laethem K (2013) Clinical evaluation of Rega 8: an updated genotypic interpretation system that significantly predicts HIV-therapy response. PLoS One 8:e61436
Watanabe SM, Chen MH, Khan M, Ehrlich L, Kemal KS, Weiser B, Shi B, Chen C, Powell M, Anastos K, Burger H, Carter CA (2013) The S40 residue in HIV-1 Gag p6 impacts local and distal budding determinants, revealing additional late domain activities. Retrovirology 10:143
Datta SA, Curtis JE, Ratcliff W, Clark PK, Crist RM, Lebowitz J, Krueger S, Rein A (2007) Conformation of the HIV-1 Gag protein in solution. J Mol Biol 365:812–824
Gong S, Park C, Choi H, Ko J, Jang I, Lee J, Bolser DM, Oh D, Kim DS, Bhak J (2005) A protein domain interaction interface database: InterPare. BMC Bioinform 6:207
Soundararajan V, Raman R, Raguram S, Sasisekharan V, Sasisekharan R (2010) Atomic interaction networks in the core of protein domains and their native folds. PLoS One 5:e9391
Li G (2014) HIV genome-wide diversity, interaction and coevolution. Doctoral thesis, University of Leuven, Belgium. https://lirias.kuleuven.be/handle/123456789/460408