A powerful microbiome-based association test and a microbial taxa discovery framework for comprehensive association mapping

Microbiome - Volume 5, Issue 1 - Pages 1-15 - 2017
Koh, Hyunwook (1); Blaser, Martin J. (2); Li, Huilin (1)
(1) Department of Population Health and Environmental Medicine, New York University School of Medicine, New York, USA
(2) Department of Medicine and Microbiology, New York University Langone Medical Center, New York, USA

Abstract

The role of the microbiota in human health and disease has been increasingly studied, gathering momentum through the use of high-throughput technologies. Further identification of the roles of specific microbes is necessary to better understand the mechanisms involved in diseases related to microbiome perturbations. Here, we introduce a new microbiome-based group association testing method, the optimal microbiome-based association test (OMiAT). OMiAT is a data-driven testing method that selects an optimal test from among the sum of powered score (SPU) tests and the microbiome regression-based kernel association test (MiRKAT). We illustrate that OMiAT efficiently discovers significant association signals arising from varying microbial abundances and from different relative contributions of microbial abundance and phylogenetic information. We also propose a way to apply it to fine-mapping of diverse upper-level taxa at different taxonomic ranks (e.g., phylum, class, order, family, and genus), as well as the entire microbial community, within a newly introduced microbial taxa discovery framework, microbiome comprehensive association mapping (MiCAM). Our extensive simulations demonstrate that OMiAT is highly robust and powerful compared with existing methods, while correctly controlling type I error rates. Our real data analyses also confirm that MiCAM is especially efficient for the assessment of upper-level taxa because it integrates OMiAT as a group analytic method. OMiAT is attractive in practice because microbiome data are highly complex and the true nature of the underlying state is unknown. MiCAM also provides a hierarchical association map for numerous microbial taxa and can serve as a guideline for further investigation of the roles of the discovered taxa in human health and disease.
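To make the "optimal test among several tests" idea concrete, the following is a minimal sketch of the SPU part only, assuming a continuous outcome, an intercept-only null model, and permutation-based p-values. It is not the authors' implementation: it leaves out the MiRKAT component and any phylogenetic weighting, and the function names (`spu_statistics`, `adaptive_spu_test`) and defaults are hypothetical. For a score vector U = Xᵀ(y − ȳ), the statistic SPU(γ) is the sum of U_j^γ over taxa (with γ = ∞ taken as max |U_j|), and the adaptive test takes the smallest p-value across γ and calibrates it against the same permutations.

```python
import numpy as np


def spu_statistics(U, gammas):
    """SPU(gamma) for each row of U; gamma = inf is taken as max |U_j|."""
    U = np.atleast_2d(U)
    cols = []
    for g in gammas:
        if np.isinf(g):
            cols.append(np.max(np.abs(U), axis=1))
        else:
            cols.append(np.sum(U ** g, axis=1))
    return np.column_stack(cols)  # shape (n_rows, n_gammas)


def adaptive_spu_test(X, y, gammas=(1, 2, 3, 4, np.inf), n_perm=999, seed=0):
    """Permutation-based SPU tests plus a min-p adaptive test (illustrative).

    X: (n_samples, n_taxa) abundance matrix; y: continuous outcome.
    Returns (p_adaptive, p_gamma), where p_gamma has one p-value per gamma.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    resid = y - y.mean()  # residuals under the intercept-only null (assumption)

    # Observed and permuted score vectors.
    U_obs = X.T @ resid
    U_perm = np.array([X.T @ rng.permutation(resid) for _ in range(n_perm)])

    # Two-sided statistics: odd powers can be negative, so use absolute values.
    T_obs = np.abs(spu_statistics(U_obs, gammas))[0]   # (n_gammas,)
    T_perm = np.abs(spu_statistics(U_perm, gammas))    # (n_perm, n_gammas)

    # Per-gamma permutation p-values for the observed data.
    exceed = np.sum(T_perm >= T_obs, axis=0)
    p_gamma = (exceed + 1) / (n_perm + 1)

    # P-value of each permutation replicate, ranked within the permutation set
    # (each replicate counts itself, so the smallest value is 1 / n_perm).
    counts = (T_perm[None, :, :] >= T_perm[:, None, :]).sum(axis=1)
    p_perm = counts / n_perm

    # Min-p across gammas, calibrated against the permutation min-p values.
    min_p_obs = np.min(exceed / n_perm)
    min_p_perm = p_perm.min(axis=1)
    p_adaptive = (np.sum(min_p_perm <= min_p_obs) + 1) / (n_perm + 1)
    return p_adaptive, p_gamma


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    X_demo = demo_rng.normal(size=(100, 10))
    y_demo = 3.0 * X_demo[:, 0] + demo_rng.normal(size=100)
    print(adaptive_spu_test(X_demo, y_demo, n_perm=199))
```

Reusing one set of permutations for both the per-γ p-values and the min-p calibration avoids a nested permutation loop. Small γ favors settings where many taxa carry weak signals, and large γ favors a few strong signals, which is why a data-driven choice across γ can be more robust than any single fixed test.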
