Discrimination of HIV-1 and HIV-2 Reverse Transcriptase Proteins Using Chou’s PseAAC

Mandana Behbahani1, Hassan Mohabatkar1, Mokhtar Nosrati1
1Department of Biotechnology, Faculty of Advanced Sciences and Technologies, University of Isfahan, Isfahan, Iran

Abstract

Reverse transcriptase (RT) is an important enzyme for retrovirus replication in susceptible target cells. HIV-1 RT is one of the key targets of anti-HIV drugs. In contrast, HIV-2 RT shows intrinsic resistance to non-nucleoside RT inhibitors (NNRTIs). In the present study, RT proteins of HIV-1 and HIV-2 were compared with respect to pseudo amino acid composition (PseAAC), conventional amino acid composition (AAC), physicochemical properties, secondary structures and structural motifs. Statistical analysis and a support vector machine (SVM) algorithm were used for the final comparison of the two RT protein groups. The results demonstrate that the AAC of four amino acids (Ala, Leu, Gln and Ser), the molecular weight and the percentage of alpha helix differ significantly between the two groups. Classification based on the concept of PseAAC achieved 100% accuracy and highlighted that pseudo pI and pseudo pKa values differ significantly between the two RT groups. In conclusion, the results indicate that computational techniques can provide useful information for comparing HIV-1 and HIV-2 RTs. Our results may also help explain the differing susceptibility of HIV-1 and HIV-2 to different drugs.
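
As an illustration of the conventional amino acid composition (AAC) feature mentioned above, the following minimal Python sketch computes the fraction of each of the 20 standard residues in a sequence. The function name `amino_acid_composition` and the two toy sequences are hypothetical and chosen only for demonstration; they are not the authors' code or real RT sequences, and the study's actual pipeline (PseAAC features, statistical tests, SVM) is not reproduced here.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a dict mapping each standard residue to its fraction in `sequence`.

    Non-standard characters (e.g. 'X', gaps) are ignored. This is a
    hypothetical helper written for illustration only.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy fragments (assumptions for the example, NOT real RT sequences).
toy_a = "ALQSALQS"
toy_b = "GGKKGGKK"

aac_a = amino_acid_composition(toy_a)
aac_b = amino_acid_composition(toy_b)

# Compare the four residues the abstract reports as differing (Ala, Leu, Gln, Ser).
for aa in "ALQS":
    print(aa, round(aac_a[aa], 3), round(aac_b[aa], 3))
```

In practice, such 20-dimensional composition vectors would be computed for every sequence in each group and then compared statistically or passed to a classifier.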
