MSClust: A Multi-Seeds based Clustering algorithm for microbiome profiling using 16S rRNA sequence
References
Sharpton, 2011, PhylOTU: a high-throughput procedure quantifies microbial community diversity and resolves novel taxa from metagenomic data, PLoS Comput. Biol., 7, e1001061, 10.1371/journal.pcbi.1001061
Schloss, 2009, Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities, Appl. Environ. Microbiol., 75, 7537, 10.1128/AEM.01541-09
Schloss, 2005, Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness, Appl. Environ. Microbiol., 71, 1501, 10.1128/AEM.71.3.1501-1506.2005
Huse, 2010, Ironing out the wrinkles in the rare biosphere through improved OTU clustering, Environ. Microbiol., 12, 1889, 10.1111/j.1462-2920.2010.02193.x
Sun, 2009, ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences, Nucleic Acids Res., 37, e76, 10.1093/nar/gkp285
Li, 2006, Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, 22, 1658, 10.1093/bioinformatics/btl158
Edgar, 2010, Search and clustering orders of magnitude faster than BLAST, Bioinformatics, 26, 2460, 10.1093/bioinformatics/btq461
Russell, 2010, A grammar-based distance metric enables fast and accurate clustering of large sets of 16S sequences, BMC Bioinforma., 11, 601, 10.1186/1471-2105-11-601
Ghodsi, 2011, DNACLUST: accurate and efficient clustering of phylogenetic marker genes, BMC Bioinforma., 12, 271, 10.1186/1471-2105-12-271
Hao, 2011, Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering, Bioinformatics, 27, 611, 10.1093/bioinformatics/btq725
Barriuso, 2011, Estimation of bacterial diversity using next generation sequencing of 16S rDNA: a comparison of different workflows, BMC Bioinforma., 12
Peng, 2010, SPICi: a fast clustering algorithm for large biological networks, Bioinformatics, 26, 1105, 10.1093/bioinformatics/btq078
Cai, 2011, ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time, Nucleic Acids Res., 39, e95, 10.1093/nar/gkr349
Lysholm, 2011, An efficient simulator of 454 data using configurable statistical models, BMC Res. Notes, 4, 449, 10.1186/1756-0500-4-449
Huse, 2007, Accuracy and quality of massively parallel DNA pyrosequencing, Genome Biol., 8, R143, 10.1186/gb-2007-8-7-r143
Xuan Vinh, 2010, Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance, J. Mach. Learn. Res., 11, 2837
Sun, 2012, A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis, Brief. Bioinform., 13, 107, 10.1093/bib/bbr009
Turnbaugh, 2009, A core gut microbiome in obese and lean twins, Nature, 457, 480, 10.1038/nature07540
Cole, 2009, The Ribosomal Database Project: improved alignments and new tools for rRNA analysis, Nucleic Acids Res., 37, D141, 10.1093/nar/gkn879
Lempel, 1976, On the complexity of finite sequences, IEEE Trans. Inf. Theory, 22, 75, 10.1109/TIT.1976.1055501
Schloss, 2011, Assessing and improving methods used in operational taxonomic unit-based approaches for 16S rRNA gene sequence analysis, Appl. Environ. Microbiol., 77, 3219, 10.1128/AEM.02810-10
Sogin, 2006, Microbial diversity in the deep sea and the underexplored “rare biosphere”, Proc. Natl. Acad. Sci. U. S. A., 103, 12115, 10.1073/pnas.0605127103