Swarm: robust and fast clustering method for amplicon-based studies
Abstract
Keywords
References
Bittner, 2013, Diversity patterns of uncultured Haptophytes unravelled by pyrosequencing in Naples Bay, Molecular Ecology, 22, 87, 10.1111/mec.12108
Caporaso, 2010, QIIME allows analysis of high-throughput community sequencing data, Nature Methods, 7, 335, 10.1038/nmeth.f.303
Caporaso, 2011, Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample, Proceedings of the National Academy of Sciences of the United States of America, 108, 4516, 10.1073/pnas.1000080107
Dunthorn, 2014, Placing environmental next-generation sequencing amplicons from microbial eukaryotes into a phylogenetic context, Molecular Biology and Evolution, 31, 993, 10.1093/molbev/msu055
Edgar, 2010, Search and clustering orders of magnitude faster than BLAST, Bioinformatics, 26, 2460, 10.1093/bioinformatics/btq461
Fu, 2012, CD-HIT: accelerated for clustering the next-generation sequencing data, Bioinformatics, 28, 3150, 10.1093/bioinformatics/bts565
Ghodsi, 2011, DNACLUST: accurate and efficient clustering of phylogenetic marker genes, BMC Bioinformatics, 12, 271, 10.1186/1471-2105-12-271
Gotoh, 1982, An improved algorithm for matching biological sequences, Journal of Molecular Biology, 162, 705, 10.1016/0022-2836(82)90398-9
Huse, 2010, Ironing out the wrinkles in the rare biosphere through improved OTU clustering, Environmental Microbiology, 12, 1889, 10.1111/j.1462-2920.2010.02193.x
Karsenti, 2011, A holistic approach to marine eco-systems biology, PLoS Biology, 9, e1001177, 10.1371/journal.pbio.1001177
Koeppel, 2013, Surprisingly extensive mixed phylogenetic and ecological signals among bacterial Operational Taxonomic Units, Nucleic Acids Research, 41, 5175, 10.1093/nar/gkt241
Logares, 2014, Patterns of rare and abundant marine microbial eukaryotes, Current Biology, 24, 813, 10.1016/j.cub.2014.02.050
Masella, 2012, PANDAseq: paired-end assembler for illumina sequences, BMC Bioinformatics, 13, 31, 10.1186/1471-2105-13-31
Nebel, 2011, Delimiting operational taxonomic units for assessing ciliate environmental diversity using small-subunit rRNA gene sequences, Environmental Microbiology Reports, 3, 154, 10.1111/j.1758-2229.2010.00200.x
Needleman, 1970, A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, 48, 443, 10.1016/0022-2836(70)90057-4
Rand, 1971, Objective criteria for the evaluation of clustering methods, Journal of the American Statistical Association, 66, 846, 10.1080/01621459.1971.10482356
R Development Core Team, 2014, R: a language and environment for statistical computing
Rognes, 2011, Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation, BMC Bioinformatics, 12, 221, 10.1186/1471-2105-12-221
Schloss, 2009, Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities, Applied and Environmental Microbiology, 75, 7537, 10.1128/AEM.01541-09
Sellers, 1974, On the theory and computation of evolutionary distances, SIAM Journal on Applied Mathematics, 26, 787, 10.1137/0126070
Smith, 1981, Identification of common molecular subsequences, Journal of Molecular Biology, 147, 195, 10.1016/0022-2836(81)90087-5
Sogin, 2006, Microbial diversity in the deep sea and the underexplored “rare biosphere”, Proceedings of the National Academy of Sciences of the United States of America, 103, 12115, 10.1073/pnas.0605127103
Stackebrandt, 1994, Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology, International Journal of Systematic Bacteriology, 44, 846, 10.1099/00207713-44-4-846
Ukkonen, 1992, Approximate string-matching with q-grams and maximal matches, Theoretical Computer Science, 92, 191, 10.1016/0304-3975(92)90143-4