Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration
Abstract
Scientific data integration and computational service discovery are challenges for the bioinformatics community. These challenges are compounded by the separate and independent construction of biological databases, which makes exchanging data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap"), offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS]) used SSWAP to semantically describe selected data and web services.

We selected high-priority Quantitative Trait Locus (QTL), genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL), Resource Description Framework (RDF) and eXtensible Markup Language (XML) documents, which are appropriate for their semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info. A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs and seven services delivering genetic locus information. For LIS we constructed three services: two allow the retrieval of DNA and RNA FASTA sequences, and the third provides nucleic acid sequence comparison capability (BLAST).

The need for semantic integration technologies has preceded available solutions. We report the feasibility of mapping high-priority data from local, independent, idiosyncratic data schemas to common shared concepts as implemented in web-accessible ontologies. These mappings are then amenable to use in semantic web services. Our implementation of approximately two dozen services means that biological data at three large information resources (Gramene, SoyBase, and LIS) are available for programmatic access, semantic searching, and enhanced interaction between the separate missions of these resources.
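The mapping step described above, from an idiosyncratic local schema to shared ontology concepts, can be illustrated with a minimal sketch. It is not the SSWAP implementation: the column names, the example.org term URIs, and the helper functions are all hypothetical, chosen only to show the idea of relabeling local fields with shared terms and emitting RDF (here as N-Triples lines).

```python
# Hypothetical mapping from local database column names to shared ontology
# terms. The URIs are placeholders, not real SSWAP, SoyBase, or Gramene terms.
FIELD_TO_TERM = {
    "qtl_nm": "http://example.org/ontology/qtl#name",
    "trait_cd": "http://example.org/ontology/qtl#trait",
    "lg": "http://example.org/ontology/qtl#linkageGroup",
}


def escape_literal(value):
    """Escape a string for use inside an N-Triples quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(subject_uri, row):
    """Turn one local record into N-Triples lines, skipping unmapped fields."""
    lines = []
    for field, value in row.items():
        term = FIELD_TO_TERM.get(field)
        if term is None:
            continue  # this local field has no shared concept in the mapping
        literal = escape_literal(str(value))
        lines.append(f'<{subject_uri}> <{term}> "{literal}" .')
    return lines


# Example record with made-up values; "internal_id" is deliberately unmapped.
record = {"qtl_nm": "Seed weight 1-1", "trait_cd": "SW", "internal_id": 42}
triples = row_to_ntriples("http://example.org/qtl/42", record)
print("\n".join(triples))
```

Fields without a shared term are dropped rather than guessed at, which reflects the idea that only data mapped to a common concept becomes discoverable through semantic search.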