Sustainable data analysis with Snakemake
Abstract
Keywords
References
M Baker, 2016, 1,500 scientists lift the lid on reproducibility., Nature., 533, 452-4, 10.1038/533452a
J Mesirov, 2010, Computer science. Accessible reproducible research., Science., 327, 415-6, 10.1126/science.1179653
M Munafò, 2017, A manifesto for reproducible science., Nat Hum Behav., 1, 0021, 10.1038/s41562-016-0021
E Afgan, 2018, The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update., Nucleic Acids Res., 46, W537-W544, 10.1093/nar/gky379
M Berthold, 2007, KNIME: The Konstanz Information Miner.
M Kluge, 2020, Watchdog 2.0: New developments for reusability, reproducibility, and workflow execution., GigaScience., 9, giaa068, 10.1093/gigascience/giaa068
A Cervera, 2019, Anduril 2: upgraded large-scale data integration framework., Bioinformatics., 35, 3815-3817, 10.1093/bioinformatics/btz133
M Salim, 2018, Balsam: Automated Scheduling and Execution of Dynamic, Data-Intensive HPC Workflows., In: Proceedings of the 8th Workshop on Python for High-Performance and Scientific Computing. ACM Press.
V Cima, 2018, HyperLoom: A Platform for Defining and Executing Scientific Pipelines in Distributed Environments., ACM., 1-6, 10.1145/3183767.3183768
L Coelho, 2017, Jug: Software for Parallel Reproducible Computation in Python., J Open Res Softw., 5, 30, 10.5334/jors.161
M Tanaka, 2010, Pwrake: a parallel and distributed flexible workflow management tool for wide-area data intensive computing., Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing - HPDC 2010., 356-359, 10.1145/1851476.1851529
L Goodstadt, 2010, Ruffus: a lightweight Python library for computational pipelines., Bioinformatics., 26, 2778-9, 10.1093/bioinformatics/btq524
S Lampa, 2019, SciPipe: A workflow library for agile development of complex and dynamic bioinformatics pipelines., GigaScience., 8, 10.1093/gigascience/giz044
Y Hold-Geoffroy, 2014, Once you SCOOP, no need to fork, Proceedings of the 2014 Annual Conference on Extreme Science and Engineering Discovery Environment., 1-8, 10.1145/2616498.2616565
F Lordan, 2013, ServiceSs: An Interoperable Programming Framework for the Cloud., J Grid Comput., 12, 67-91, 10.1007/s10723-013-9272-5
S Pal, 2020, Bioinformatics pipeline using JUDI: Just Do It!, Bioinformatics., 36, 2572-2574, 10.1093/bioinformatics/btz956
P Di Tommaso, 2017, Nextflow enables reproducible computational workflows., Nat Biotechnol., 35, 316-319, 10.1038/nbt.3820
J Köster, 2012, Snakemake–a scalable bioinformatics workflow engine., Bioinformatics., 28, 2520, 10.1093/bioinformatics/bts480
L Yao, 2017, BioQueue: a novel pipeline framework to accelerate bioinformatics analysis., Bioinformatics., 33, 3286-3288, 10.1093/bioinformatics/btx403
S Sadedin, 2012, Bpipe: a tool for running and managing bioinformatics pipelines., Bioinformatics., 28, 1525-6, 10.1093/bioinformatics/bts167
P Ewels, 2016, Cluster Flow: A user-friendly bioinformatics workflow tool [version 1; peer review: 3 approved]., F1000Res., 5, 2824, 10.12688/f1000research.10335.1
H Oliver, 2018, Cylc: A Workflow Engine for Cycling Systems., J Open Source Softw., 3, 737, 10.21105/joss.00737
P Cingolani, 2015, BigDataScript: a scripting language for data pipelines., Bioinformatics., 31, 10-16, 10.1093/bioinformatics/btu595
I Jimenez, 2017, The Popper Convention: Making Reproducible Systems Evaluation Practical, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)., 10.1109/IPDPSW.2017.157
C Evans, 2009, YAML Ain’t Markup Language YAML Version 1.2.
K Voss, 2017, Full-stack genomics pipelining with GATK4 + WDL + Cromwell., F1000Res., 6, 10.7490/f1000research.1114634.1
J Vivian, 2017, Toil enables reproducible open source, big biomedical data analyses., Nat Biotechnol., 35, 314-316, 10.1038/nbt.3772
S Lee, 2019, Tibanna: software for scalable execution of portable pipelines on the cloud., Bioinformatics., 35, 4424-4426, 10.1093/bioinformatics/btz379
G Kurtzer, 2017, Singularity: Scientific containers for mobility of compute., PLoS One., 12, e0177459, 10.1371/journal.pone.0177459
D Huizinga, 2007, Automated Defect Prevention: Best Practices in Software Management, 10.1002/9780470165171
J Chall, 1995, Readability revisited: the new Dale-Chall readability formula.
L Sundkvist, 2017, Code Styling and its Effects on Code Readability and Interpretation
B Grüning, 2018, Practical Computational Reproducibility in the Life Sciences., Cell Syst., 6, 631-635, 10.1016/j.cels.2018.03.014
J Köster, 2020, Data analysis for paper "Sustainable data analysis with Snakemake"., Zenodo.
H Handschuh, 2005, SHA Family (Secure Hash Algorithm)., Encyclopedia of Cryptography and Security. Springer US., 565-567, 10.1007/0-387-23483-7_388
A Narayanan, 2016, Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction.
W McKinney, 2010, Data Structures for Statistical Computing in Python., Proceedings of the 9th Python in Science Conference., 56-61, 10.25080/Majora-92bf1922-00a