Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution

ISME Journal - Volume 9, Issue 1 - Pages 68-80 - 2015
Mikhail Tikhonov1,2, Robert Leach2, Ned S. Wingreen3,2
1Joseph Henry Laboratories of Physics, Princeton University, Princeton, NJ, USA
2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
3Department of Molecular Biology, Princeton University, Princeton, NJ, USA

Abstract

The standard approach to analyzing 16S tag sequence data, which relies on clustering reads by sequence similarity into Operational Taxonomic Units (OTUs), underexploits the accuracy of modern sequencing technology. We present a clustering-free approach to multi-sample Illumina data sets that can identify independent bacterial subpopulations regardless of the similarity of their 16S tag sequences. Using published data from a longitudinal time-series study of human tongue microbiota, we are able to resolve up to 20 distinct subpopulations within standard 97% similarity OTUs, all ecologically distinct but with 16S tags differing by as little as one nucleotide (99.2% similarity). A comparative analysis of oral communities of two cohabiting individuals reveals that most such subpopulations are shared between the two communities at 100% sequence identity, and that dynamical similarity between subpopulations in one host is strongly predictive of dynamical similarity between the same subpopulations in the other host. Our method can also be applied to samples collected in cross-sectional studies and can be used with the 454 sequencing platform. We discuss how the sub-OTU resolution of our approach can provide new insight into factors shaping community assembly.


