InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data

Bioinformatics (Oxford, England) - Volume 28, Issue 23 - Pages 3163-3165 - 2012
Richard Smith¹, J. Aleksić, Daniela Butano, Adrian R. Carr, Sergio Contrino, Fengyuan Hu, Mike Lyne, Rachel Lyne, Alex Kalderimis, Kim Rutherford, Radek Štěpán, Julie Sullivan, Matthew N. Wakeling, Xavier Watkins, Gos Micklem
¹Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK

Abstract

Summary: InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of ‘widgets’ performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages.

Availability: Freely available from http://www.intermine.org under the LGPL license.

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
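As a rough illustration of how the web services mentioned in the summary might be called, the sketch below builds a query document and the corresponding request URL using only the Python standard library. It does not contact any server. The service root `https://example.org/mymine/service` is hypothetical, and the XML layout (a `query` element with `model` and `view` attributes plus `constraint` children), the `query/results` path, and the `query` and `format` parameters are assumptions about the service interface that are not stated in the abstract; check them against the InterMine documentation before relying on them.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET


def build_path_query(root_class, views, constraints, model="genomic"):
    """Serialize a query as XML (assumed layout, not taken from the abstract)."""
    query = ET.Element("query", {
        "model": model,
        # Output columns, written as space-separated paths from the root class.
        "view": " ".join(f"{root_class}.{v}" for v in views),
    })
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {
            "path": f"{root_class}.{path}",
            "op": op,
            "value": value,
        })
    return ET.tostring(query, encoding="unicode")


def build_results_url(service_root, query_xml, fmt="tab"):
    """Compose the request URL; endpoint and parameter names are assumptions."""
    params = urlencode({"query": query_xml, "format": fmt})
    return f"{service_root.rstrip('/')}/query/results?{params}"


# Hypothetical service root, for illustration only.
SERVICE_ROOT = "https://example.org/mymine/service"

xml = build_path_query(
    "Gene",
    views=["symbol", "primaryIdentifier"],
    constraints=[("organism.name", "=", "Drosophila melanogaster")],
)
url = build_results_url(SERVICE_ROOT, xml)
print(url)
```

In practice, the client libraries mentioned in the summary would likely be the more convenient route; this sketch only shows the general shape of a request under the stated assumptions.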
