In silico structural and functional characterization of hypothetical proteins from Monkeypox virus
Abstract
Monkeypox virus is a small, double-stranded DNA virus that causes the zoonotic disease Monkeypox. The disease has spread from Central and West Africa to Europe and North America and has caused serious outbreaks in several countries worldwide. The complete genome of Monkeypox virus Zaire-96-I-16 has been sequenced. This strain contains 191 protein-coding genes, including 30 hypothetical proteins whose structure and function are still unknown. Annotating these hypothetical proteins structurally and functionally is therefore essential for identifying novel drug and vaccine targets. The purpose of this study was to characterize the 30 hypothetical proteins using bioinformatics tools, through determination of physicochemical properties, subcellular localization, function prediction, functional domain prediction, structure prediction, structure validation, structural analysis, and ligand binding site prediction. Of the 30 hypothetical proteins analyzed, 3 (Q8V547, Q8V4S4, Q8V4Q4) could be confidently assigned a structure and function. Q8V547 is predicted to be an apoptosis regulator that promotes viral replication in the infected host cell. Q8V4S4 is predicted to be a nuclease involved in viral evasion of the host. Q8V4Q4 is predicted to prevent host NF-kappa-B activation in response to pro-inflammatory cytokines such as TNF alpha or interleukin 1 beta. These three proteins thus function as an apoptosis regulator, a nuclease, and an inhibitor of NF-kappa-B activation, respectively. Their functional and structural annotation can be used in docking studies with potential lead compounds to discover novel drugs and vaccines against Monkeypox. In vivo research can then be carried out to establish the full potential of the annotated proteins.