An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea

ISME Journal - Volume 6, Issue 3 - Pages 610-618 - 2012

Daniel McDonald¹, Morgan N. Price², Julia K. Goodrich¹, Eric P. Nawrocki³, Todd Z. DeSantis⁴, Alexander J. Probst⁵, Gary L. Andersen⁵, Rob Knight¹,⁶, Philip Hugenholtz⁷

¹Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA

²Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, CA, USA

³Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA

⁴Department of Bioinformatics, Second Genome Inc., San Bruno, CA, USA

⁵Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA

⁶Howard Hughes Medical Institute, Boulder, CO, USA

⁷Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences and Institute for Molecular Bioscience, St Lucia, Queensland, Australia

Abstract

Reference phylogenies are crucial for providing a taxonomic framework for interpreting marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated the explicit rank information provided by the NCBI taxonomy into the group names (by prefixing rank designations) to improve user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy, with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed the candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidating 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree), are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The software implementation can be obtained from http://sourceforge.net/projects/tax2tree/.
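The idea of prefixing rank designations to group names can be illustrated with a minimal sketch. It is not taken from tax2tree itself: the function names `prefix_ranks` and `classified_depth` are hypothetical, and the seven single-letter prefixes (`k__` through `s__`) are assumed here to follow the convention seen in distributed Greengenes taxonomy strings.

```python
# Illustrative sketch only; not the tax2tree implementation.
# Assumption: seven ranks, each written as a one-letter prefix plus "__".
RANK_PREFIXES = ["k", "p", "c", "o", "f", "g", "s"]


def prefix_ranks(lineage):
    """Build a rank-prefixed taxonomy string from names ordered kingdom..species.

    Missing trailing ranks (or None entries) are emitted as empty
    placeholders so that every string carries all seven ranks.
    """
    if len(lineage) > len(RANK_PREFIXES):
        raise ValueError("lineage has more names than known ranks")
    padded = list(lineage) + [None] * (len(RANK_PREFIXES) - len(lineage))
    return "; ".join(
        f"{rank}__{name or ''}" for rank, name in zip(RANK_PREFIXES, padded)
    )


def classified_depth(taxonomy_string):
    """Count how many ranks in a prefixed taxonomy string carry a name."""
    fields = [field.strip() for field in taxonomy_string.split(";")]
    return sum(1 for field in fields if field.split("__", 1)[1])


before = prefix_ranks(["Bacteria"])
after = prefix_ranks(["Bacteria", "Cyanobacteria"])
print(after)
print(classified_depth(after) - classified_depth(before))
```

With explicit prefixes, two classifications of the same sequence can be compared rank by rank, which is one way to express an improvement "by one or more ranks" as a simple difference in the number of named ranks.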

