COCONUT online: Collection of Open Natural Products database

Maria Sorokina1, Peter Merseburger1, Kohulan Rajan1, Mehmet Aziz Yirik1, Christoph Steinbeck1
1Institute for Inorganic and Analytical Chemistry, University Friedrich-Schiller, Lessing Strasse 8, 07743, Jena, Germany

Abstract

Natural products (NPs) are small molecules produced by living organisms. Because many of them are bioactive, they have potential applications in pharmacology and other industries. This potential has raised great interest in NP research around the world and across application fields, and the number of general and thematic NP databases has multiplied over the years. However, there is currently no online resource that gathers all known NPs in one place, which would greatly simplify NP research and enable computational screening and other in silico applications. In this manuscript we present the online version of the COlleCtion of Open Natural prodUcTs (COCONUT): an aggregated dataset of elucidated and predicted NPs collected from open sources, together with a web interface to browse, search, and quickly download NPs. COCONUT web is freely available at https://coconut.naturalproducts.net.
