BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification
Abstract
A number of computational tools for metabolism prediction have been developed over the last 20 years to predict the structures of small molecules undergoing biological transformation or environmental degradation. These tools were largely developed to facilitate absorption, distribution, metabolism, excretion, and toxicity (ADMET) studies, although there is now a growing interest in using such tools to facilitate metabolomics and exposomics studies. However, their use and widespread adoption are still hampered by several factors, including their limited scope, breadth of coverage, availability, and performance. To address these limitations, we have developed BioTransformer, a freely available software package for accurate, rapid, and comprehensive in silico metabolism prediction and compound identification. BioTransformer combines a machine learning approach with a knowledge-based approach to predict small molecule metabolism in human tissues (e.g. liver tissue), the human gut, and the environment (soil and water microbiota) via its metabolism prediction tool. A comprehensive evaluation of BioTransformer showed that it was able to outperform two state-of-the-art commercially available tools (Meteor Nexus and ADMET Predictor), with precision and recall values up to 7 times better than those obtained for Meteor Nexus or ADMET Predictor on the same sets of pharmaceuticals, pesticides, phytochemicals, or endobiotics under similar or identical constraints. Furthermore, BioTransformer was able to reproduce 100% of the transformations and metabolites predicted by the EAWAG pathway prediction system.
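As a reminder of how the precision and recall figures mentioned above are typically defined, the following is a minimal sketch, assuming predicted and experimentally reported metabolites are compared as sets of structure identifiers. The function name and the identifiers `M1` to `M5` are hypothetical and are not data from the study; the actual evaluation protocol may differ.

```python
def precision_recall(predicted, reported):
    """Return (precision, recall) for two collections of structure identifiers.

    precision = correctly predicted / all predicted
    recall    = correctly predicted / all reported
    """
    predicted, reported = set(predicted), set(reported)
    true_positives = len(predicted & reported)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reported) if reported else 0.0
    return precision, recall


# Hypothetical identifiers for illustration only (not results from the paper).
predicted = {"M1", "M2", "M3", "M4"}
reported = {"M2", "M3", "M5"}

precision, recall = precision_recall(predicted, reported)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the four predictions are reported metabolites, and two of the three reported metabolites are recovered.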
Using mass spectrometry data obtained from a rat experimental study with epicatechin supplementation, BioTransformer was also able to correctly identify 39 previously reported epicatechin metabolites via its metabolism identification tool, and suggest 28 potential metabolites, 17 of which matched nine monoisotopic masses for which no evidence of a previous report could be found. BioTransformer can be used as an open access command-line tool, or a software library. It is freely available at https://bitbucket.org/djoumbou/biotransformerjar/. Moreover, it is also freely available as an open access RESTful application at www.biotransformer.ca, which allows users to manually or programmatically submit queries, and retrieve metabolism predictions or compound identification data.
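The idea of matching predicted metabolites to observed monoisotopic masses can be sketched as follows. This is an illustrative example only, not BioTransformer's implementation: the function names, the 10 ppm tolerance, and the observed mass values are assumptions, and only neutral masses of compounds containing C, H, and O are handled.

```python
# Monoisotopic masses (in Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}


def monoisotopic_mass(composition):
    """Neutral monoisotopic mass from an element-count mapping, e.g. {"C": 15, ...}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_masses(observed_masses, candidates, tolerance_ppm=10.0):
    """Map each observed mass to the candidate names within the ppm tolerance."""
    theoretical = {name: monoisotopic_mass(comp) for name, comp in candidates.items()}
    return {
        observed: [
            name
            for name, mass in theoretical.items()
            if abs(ppm_error(observed, mass)) <= tolerance_ppm
        ]
        for observed in observed_masses
    }


# Candidate structures given by elemental composition.
candidates = {
    "epicatechin": {"C": 15, "H": 14, "O": 6},
    "O-methyl-epicatechin": {"C": 16, "H": 16, "O": 6},
}

# Hypothetical observed neutral masses (not measurements from the study).
observed_masses = [290.0792, 304.0949, 350.1000]

matches = match_masses(observed_masses, candidates)
for observed, names in matches.items():
    print(observed, names)
```

A mass with no candidate inside the tolerance window maps to an empty list, which is how an unexplained feature would show up in this sketch.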