TACO: Taxonomic prediction of unknown OTUs through OTU co‐abundance networks

Zohreh Baharvand Irannia (1), Ting Chen (1, 2)
(1) Program in Computational Biology and Bioinformatics, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
(2) Bioinformatics Division, TNLIST, Tsinghua University, Beijing 100084, China

Abstract

Background: A main goal of metagenomics is the taxonomic characterization of microbial communities. Although sequence comparison has been the main method for taxonomic classification, there is no clear agreement on how to calculate similarity or which similarity thresholds to use, especially at higher taxonomic levels such as phylum and class. Taxonomic classification of novel metagenomic sequences without close homologs in the biological databases therefore poses a challenge.

Methods: In this study, we propose to use the co‐abundant associations between taxa/operational taxonomic units (OTUs) across complex and diverse communities to assist taxonomic classification. We developed a Markov Random Field model to predict the taxa of unknown microorganisms using co‐abundant associations.

Results: Although such associations are intrinsically functional associations, we demonstrate that they are strongly correlated with taxonomic associations and can be combined with sequence comparison methods to predict the taxonomic origins of unknown microorganisms at the phylum and class levels.

Conclusions: With the ever‐increasing accumulation of sequence data from microbial communities, we now take the first step to explore these associations for taxonomic identification beyond sequence similarity.

Availability and Implementation: TACO is implemented in C++ and supported on Linux and MS Windows. Its source code is freely available at the following URL: https://github.com/baharvand/OTU‐Taxonomy‐Identification
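To make the idea in the Methods concrete, here is a minimal Python sketch of predicting taxa from a co‐abundance network. It is not TACO's C++ implementation and does not reproduce its Markov Random Field model. It assumes Pearson correlation as the co‐abundance measure and a cutoff of 0.8, and it substitutes a simple correlation‐weighted neighbor vote for MRF inference. The function names (`pearson`, `build_network`, `predict_taxa`), the OTU names, the abundance values, and the phylum labels are all hypothetical toy inputs.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def build_network(abundance, threshold=0.8):
    """Connect two OTUs when their profiles correlate at or above threshold.

    The threshold is an assumption made for this sketch only.
    """
    edges = defaultdict(dict)
    for a, b in combinations(abundance, 2):
        r = pearson(abundance[a], abundance[b])
        if r >= threshold:
            edges[a][b] = r
            edges[b][a] = r
    return edges


def predict_taxa(edges, known, unknown, rounds=5):
    """Label unknown OTUs by a correlation-weighted vote of labeled neighbors.

    This voting scheme is a simplified stand-in, not MRF inference.
    """
    labels = dict(known)
    for _ in range(rounds):
        changed = False
        for otu in unknown:
            votes = defaultdict(float)
            for neighbor, weight in edges.get(otu, {}).items():
                if neighbor in labels:
                    votes[labels[neighbor]] += weight
            if votes:
                best = max(votes, key=votes.get)
                if labels.get(otu) != best:
                    labels[otu] = best
                    changed = True
        if not changed:
            break  # labels are stable, so further rounds change nothing
    return {otu: labels.get(otu) for otu in unknown}


# Toy abundance table: OTU -> counts across four samples (made-up values).
abundance = {
    "otu1": [10, 20, 30, 40],
    "otu2": [12, 22, 33, 41],
    "otu3": [40, 30, 20, 10],
    "otu4": [38, 31, 19, 12],
    "otuX": [11, 19, 31, 39],
    "otuY": [41, 29, 21, 9],
}
# Made-up phylum labels for the OTUs treated as known.
known = {
    "otu1": "Firmicutes",
    "otu2": "Firmicutes",
    "otu3": "Bacteroidetes",
    "otu4": "Bacteroidetes",
}

network = build_network(abundance)
predictions = predict_taxa(network, known, ["otuX", "otuY"])
print(predictions)
```

In this toy table each unknown OTU rises and falls with one group of labeled OTUs, so the vote assigns it that group's phylum. A fuller treatment would combine this network evidence with sequence similarity, as the abstract describes.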
