Discrimination of HIV-1 and HIV-2 Reverse Transcriptase Proteins Using Chou’s PseAAC
Iranian Journal of Science and Technology, Transactions A: Science, Volume 42, Pages 1805-1811, 2017
Abstract
Reverse transcriptase (RT) is an essential enzyme for retrovirus replication in susceptible target cells. The RT of HIV-1 is one of the key targets for anti-HIV drugs. In contrast, HIV-2 RT is intrinsically resistant to non-nucleoside RT inhibitors (NNRTIs). In the present study, we compared several aspects of the RT proteins of HIV-1 and HIV-2: pseudo amino acid composition (PseAAC), conventional amino acid composition (AAC), physicochemical properties, secondary structures and structural motifs. Statistical analysis and a support vector machine (SVM) algorithm were used for the final comparison of the two RT protein groups. The results demonstrate that the AAC of four amino acids (Ala, Leu, Gln and Ser), the molecular weight and the percentage of alpha helix of the RT proteins differ significantly between the two types. Classification based on the concept of PseAAC achieved 100% accuracy and highlighted that pseudo pI and pseudo pKa values differ significantly between the two RT groups. In conclusion, the results indicate that computational techniques can provide useful information for comparing HIV-1 and HIV-2 RTs. Our results may also help explain the difference in susceptibility of HIV-1 and HIV-2 to different drugs.
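As an illustration of the conventional amino acid composition (AAC) feature mentioned above, the following is a minimal sketch in Python. It is not the authors' pipeline: the function name `amino_acid_composition` and the toy peptide are hypothetical, and the sketch assumes AAC is expressed as the percentage of each of the 20 standard residues in a sequence.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue in a protein sequence.

    Non-standard characters (e.g. 'X', gaps) are ignored; this is an
    assumption of the sketch, not a detail taken from the study.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy peptide, not a real RT sequence.
toy_sequence = "ALQSALLA"
aac = amino_acid_composition(toy_sequence)
# Report the four residues highlighted in the abstract.
for residue in "ALQS":
    print(residue, aac[residue])
```

Applied to two sets of RT sequences, such 20-dimensional vectors could then be compared residue by residue with a statistical test or passed to a classifier such as an SVM; PseAAC extends the same vector with additional sequence-order terms.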