Collaborative development of predictive toxicology applications

Springer Science and Business Media LLC - Volume 2 - Pages 1-29 - 2010
Barry Hardy1, Nicki Douglas1, Christoph Helma2, Micha Rautenberg2, Nina Jeliazkova3, Vedrin Jeliazkov3, Ivelina Nikolova3, Romualdo Benigni4, Olga Tcheremenskaia4, Stefan Kramer5, Tobias Girschick5, Fabian Buchwald5, Joerg Wicker5, Andreas Karwath6, Martin Gütlein6, Andreas Maunz6, Haralambos Sarimveis7, Georgia Melagraki7, Antreas Afantitis7, Pantelis Sopasakis7, David Gallagher8, Vladimir Poroikov9, Dmitry Filimonov9, Alexey Zakharov9, Alexey Lagunin9, Tatyana Gloriozova9, Sergey Novikov9, Natalia Skvortsova9, Dmitry Druzhilovsky9, Sunil Chawla10, Indira Ghosh11, Surajit Ray11, Hitesh Patel11, Sylvia Escher12
1Douglas Connect, Zeiningen, Switzerland
2In silico Toxicology, Basel, Switzerland
3Ideaconsult Ltd, Sofia, Bulgaria
4Istituto Superiore di Sanità, Environment and Health Department, Rome, Italy
5Technische Universität München, München, Germany
6Albert-Ludwigs University Freiburg, Freiburg i.Br, Germany
7National Technical University of Athens, School of Chemical Engineering, Zographou, Greece
8David Gallagher, Congresbury, UK
9Institute of Biomedical Chemistry of Russian Academy of Sciences, Moscow, Russia
10Seascape Learning, New Delhi, India
11Jawaharlal Nehru University, New Delhi, India
12Fraunhofer Institute for Toxicology & Experimental Medicine, Hannover, Germany

Abstract

OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation, as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework, including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools for non-computational specialists, risk assessors, and toxicological experts, in addition to Application Programming Interfaces (APIs) for developers of new applications. To advance system interoperability goals, OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration. The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting, which may be combined into multiple applications satisfying a variety of user needs. OpenTox applications are based on a set of distributed, interoperable, OpenTox API-compliant REST web services. The OpenTox approach to ontology allows efficient mapping of complementary data from different datasets into a unifying structure with a shared terminology and representation.
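The idea of mapping complementary data from different datasets into a shared terminology can be illustrated with a minimal Python sketch. It is not the OpenTox ontology implementation: the term URIs, dataset names, field names and values below are all hypothetical placeholders chosen only to show the shape of such a mapping.

```python
"""Sketch: merge dataset-specific fields under shared ontology terms."""

# Hypothetical shared vocabulary. These URIs are placeholders, not real
# OpenTox ontology identifiers.
SHARED_TERMS = {
    "carcinogenicity": "http://example.org/ontology#Carcinogenicity",
    "mutagenicity": "http://example.org/ontology#Mutagenicity",
}

# Hypothetical per-dataset mappings from local field names to shared keys.
FIELD_MAPPINGS = {
    "dataset_a": {"ActivityOutcome_Carc": "carcinogenicity"},
    "dataset_b": {"carc_call": "carcinogenicity", "ames": "mutagenicity"},
}


def unify(records_by_dataset):
    """Group values from several datasets by compound and shared term URI.

    records_by_dataset maps a dataset name to a list of dicts, each holding a
    'compound' identifier plus dataset-specific fields. Fields without a
    declared mapping are skipped rather than guessed.
    """
    unified = {}
    for dataset, records in records_by_dataset.items():
        mapping = FIELD_MAPPINGS.get(dataset, {})
        for record in records:
            compound = record["compound"]
            for field, value in record.items():
                if field not in mapping:
                    continue  # includes the 'compound' key itself
                term_uri = SHARED_TERMS[mapping[field]]
                by_term = unified.setdefault(compound, {})
                by_term.setdefault(term_uri, []).append((dataset, value))
    return unified


# Made-up example records; the values carry no toxicological meaning.
EXAMPLE = {
    "dataset_a": [{"compound": "compound-1", "ActivityOutcome_Carc": "active"}],
    "dataset_b": [{"compound": "compound-1", "carc_call": "positive", "ames": "negative"}],
}

if __name__ == "__main__":
    for compound, terms in unify(EXAMPLE).items():
        print(compound, terms)
```

Keeping the mapping as explicit data, rather than code, means a new dataset can be integrated by declaring how its fields relate to the shared terms, without touching the merge logic.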
Two initial OpenTox applications are presented to illustrate the potential impact of OpenTox on high-quality, consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict, which predicts and reports toxicities across endpoints for an input chemical structure, and ToxCreate, which builds and validates a predictive toxicity model from an input toxicology dataset. Because the standardised Framework design is extensible, barriers to interoperability between applications and content are removed: the user may combine data, models and validation from multiple sources in a dependable and time-effective way.
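How a client might drive REST services of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the ToxCreate or ToxPredict implementation: the service root `http://example.org/opentox`, the resource paths, and the form parameter names `dataset_uri` and `prediction_feature` are assumed for the example and should be checked against the published interface specification. The requests are only constructed, never sent.

```python
"""Sketch: build the two POST requests of a model-building / prediction flow."""
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical service root and resources (assumptions, not real endpoints).
BASE = "http://example.org/opentox"
ALGORITHM_URI = BASE + "/algorithm/example-learner"
TRAINING_DATASET_URI = BASE + "/dataset/1"
ENDPOINT_FEATURE_URI = BASE + "/feature/1"


def form_post(uri, **params):
    """Build (but do not send) a form-encoded POST asking for a URI list."""
    body = urlencode(params).encode("ascii")
    return Request(uri, data=body, method="POST",
                   headers={"Accept": "text/uri-list"})


def create_model_request(algorithm_uri, dataset_uri, prediction_feature):
    """Model building: post a training dataset to an algorithm resource.

    The parameter names are assumptions made for this sketch.
    """
    return form_post(algorithm_uri, dataset_uri=dataset_uri,
                     prediction_feature=prediction_feature)


def predict_request(model_uri, dataset_uri):
    """Prediction: post the dataset to be predicted to a model resource."""
    return form_post(model_uri, dataset_uri=dataset_uri)


if __name__ == "__main__":
    req = create_model_request(ALGORITHM_URI, TRAINING_DATASET_URI,
                               ENDPOINT_FEATURE_URI)
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Because every resource is addressed by a URI, the training dataset, the algorithm and the resulting model could in principle live on different servers; the client only passes URIs between them.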
