Bioconductor: open software development for computational biology and bioinformatics
Abstract
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.