Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution

ISME Journal - Volume 9, Issue 1 - Pages 68-80 - 2015
Mikhail Tikhonov1,2, Robert Leach2, Ned S. Wingreen3,2
1Joseph Henry Laboratories of Physics, Princeton University, Princeton, NJ, USA
2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
3Department of Molecular Biology, Princeton University, Princeton, NJ, USA

Abstract

The standard approach to analyzing 16S tag sequence data, which relies on clustering reads by sequence similarity into Operational Taxonomic Units (OTUs), underexploits the accuracy of modern sequencing technology. We present a clustering-free approach to multi-sample Illumina data sets that can identify independent bacterial subpopulations regardless of the similarity of their 16S tag sequences. Using published data from a longitudinal time-series study of human tongue microbiota, we resolve up to 20 distinct subpopulations within standard 97%-similarity OTUs, all ecologically distinct but with 16S tags differing by as little as one nucleotide (99.2% similarity). A comparative analysis of the oral communities of two cohabiting individuals reveals that most such subpopulations are shared between the two communities at 100% sequence identity, and that dynamical similarity between subpopulations in one host is strongly predictive of dynamical similarity between the same subpopulations in the other host. Our method can also be applied to samples collected in cross-sectional studies and can be used with the 454 sequencing platform. We discuss how the sub-OTU resolution of our approach can provide new insight into factors shaping community assembly.
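To make the idea of separating near-identical tags by their behavior across samples concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical table `counts` mapping each exact tag sequence to its read counts per sample, and it flags pairs of tags that differ by at most one nucleotide yet have weakly correlated relative abundances. The working assumption, labeled as such, is that a variant produced by sequencing error would track its parent sequence across samples, whereas an independent subpopulation need not. The function names and the thresholds `max_mismatches` and `max_corr` are illustrative choices.

```python
from itertools import combinations
from math import sqrt


def hamming(a, b):
    """Number of mismatched positions between two equal-length tags."""
    if len(a) != len(b):
        raise ValueError("tags must have equal length")
    return sum(x != y for x, y in zip(a, b))


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def candidate_subpopulations(counts, max_mismatches=1, max_corr=0.5):
    """Return (tag_a, tag_b, mismatches, correlation) for near-identical
    tags whose relative abundances are weakly correlated across samples.

    counts: dict mapping tag sequence -> list of read counts per sample.
    Both thresholds are assumptions chosen for illustration only.
    """
    n_samples = len(next(iter(counts.values())))
    # Per-sample sequencing depth, used to convert counts to fractions.
    totals = [sum(c[i] for c in counts.values()) for i in range(n_samples)]
    rel = {t: [c[i] / totals[i] for i in range(n_samples)]
           for t, c in counts.items()}
    pairs = []
    for a, b in combinations(sorted(counts), 2):
        d = hamming(a, b)
        if d <= max_mismatches:
            r = pearson(rel[a], rel[b])
            if r < max_corr:
                pairs.append((a, b, d, r))
    return pairs


# Toy data (invented for illustration): two tags one nucleotide apart
# with opposite dynamics, plus one unrelated tag.
toy_counts = {
    "ACGTACGT": [90, 10, 80, 5],
    "ACGTACGA": [10, 85, 15, 90],
    "TTTTCCCC": [50, 50, 50, 50],
}
toy_pairs = candidate_subpopulations(toy_counts)
```

On the toy table, only the two tags that differ at a single position are reported, because their abundances move in opposite directions across the four samples. A similarity-based clustering at 97% would have merged them into one OTU.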
