Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights

PLoS Computational Biology, Volume 12, Issue 7, Page e1004977
Edoardo Pasolli (1), Duy Tin Truong (1), Faizan Malik (2), Levi Waldron (2), Nicola Segata (1)
(1) Centre for Integrative Biology, University of Trento, Trento, Italy
(2) Graduate School of Public Health and Health Policy, City University of New York, New York, New York, United States of America

References

Human Microbiome Project Consortium, 2012, Structure, function and diversity of the healthy human microbiome, Nature, 486, 207, 10.1038/nature11234

I Cho, 2012, The human microbiome: at the interface of health and disease, Nature Rev Genet, 13, 260, 10.1038/nrg3182

D Gevers, 2012, The human microbiome project: a community resource for the healthy human microbiome, PLoS Biol, 10, e1001377, 10.1371/journal.pbio.1001377

C Manichanh, 2006, Reduced diversity of faecal microbiota in Crohn’s disease revealed by a metagenomic approach, Gut, 55, 205, 10.1136/gut.2005.073817

DN Frank, 2007, Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases, PNAS, 104, 13780, 10.1073/pnas.0706625104

RE Ley, 2005, Obesity alters gut microbial ecology, PNAS, 102, 11070, 10.1073/pnas.0504978102

RE Ley, 2006, Microbial ecology: human gut microbes associated with obesity, Nature, 444, 1022, 10.1038/4441022a

J Qin, 2012, A metagenome-wide association study of gut microbiota in type 2 diabetes, Nature, 490, 55, 10.1038/nature11450

EA Eloe-Fadrosh, 2013, The human microbiome: from symbiosis to pathogenesis, Annu Rev Med, 64, 145, 10.1146/annurev-med-010312-133513

TS Furey, 2000, Support vector machine classification and validation of cancer tissue samples using microarray expression data, Bioinformatics, 16, 906, 10.1093/bioinformatics/16.10.906

A Statnikov, 2005, A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis, Bioinformatics, 21, 631, 10.1093/bioinformatics/bti033

DR Rhodes, 2004, Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression, PNAS, 101, 9309, 10.1073/pnas.0401994101

L Waldron, 2014, Comparative meta-analysis of prognostic gene signatures for late-stage ovarian cancer, J Natl Cancer Inst, 106, dju049, 10.1093/jnci/dju049

M Hamady, 2009, Microbial community profiling for human microbiome projects: Tools, techniques, and challenges, Genome Res, 19, 1141, 10.1101/gr.085464.108

CA Lozupone, 2013, Meta-analyses of studies of the human microbiota, Genome Res, 23, 1704, 10.1101/gr.151803.112

D Gevers, 2014, The treatment-naive microbiome in new-onset Crohn’s disease, Cell Host Microbe, 15, 382, 10.1016/j.chom.2014.02.005

F Teng, 2015, Prediction of early childhood caries via spatial-temporal variations of oral microbiota, Cell Host Microbe, 18, 296, 10.1016/j.chom.2015.08.005

A Statnikov, 2013, A comprehensive evaluation of multicategory classification methods for microbiomic data, Microbiome, 1, 11, 10.1186/2049-2618-1-11

N Segata, 2013, Computational meta'omics for microbial community studies, Mol Syst Biol, 9, 666, 10.1038/msb.2013.22

S Sunagawa, 2013, Metagenomic species profiling using universal phylogenetic marker genes, Nature Methods, 10, 1196, 10.1038/nmeth.2693

DT Truong, 2015, MetaPhlAn2 for enhanced metagenomic taxonomic profiling, Nature Methods, 12, 902, 10.1038/nmeth.3589

AE Darling, 2014, PhyloSift: phylogenetic analysis of genomes and metagenomes, PeerJ, 2, e243, 10.7717/peerj.243

M Scholz, 2016, Strain-level microbial epidemiology and population genomics from shotgun meta’omics, Nature Methods, 13, 435, 10.1038/nmeth.3802

JM Norman, 2015, Disease-specific alterations in the enteric virome in inflammatory bowel disease, Cell, 160, 447, 10.1016/j.cell.2015.01.002

MO Sommer, 2009, Functional characterization of the antibiotic resistance reservoir in the human microflora, Science, 325, 1128, 10.1126/science.1176950

Y Hu, 2013, Metagenome-wide analysis of antibiotic resistance genes in a large cohort of human gut microbiota, Nat Commun, 4

TJ Sharpton, 2014, An introduction to the analysis of shotgun metagenomic data, Front Plant Sci, 5

Y Lan, 2013, Selecting age-related functional characteristics in the human gut microbiome, Microbiome, 1

M Arumugam, 2011, Enterotypes of the human gut microbiome, Nature, 473, 174, 10.1038/nature09944

N Segata, 2012, Metagenomic microbial community profiling using unique clade-specific marker genes, Nature Methods, 9, 811, 10.1038/nmeth.2066

E Le Chatelier, 2013, Richness of human gut microbiome correlates with metabolic markers, Nature, 500, 541, 10.1038/nature12506

FH Karlsson, 2013, Gut metagenome in European women with normal, impaired and diabetic glucose control, Nature, 498, 99, 10.1038/nature12198

N Qin, 2014, Alterations of the human gut microbiome in liver cirrhosis, Nature, 513, 59, 10.1038/nature13568

G Zeller, 2014, Potential of fecal microbiota for early‐stage detection of colorectal cancer, Mol Syst Biol, 10, 766, 10.15252/msb.20145645

J Qin, 2010, A human gut microbial gene catalogue established by metagenomic sequencing, Nature, 464, 59, 10.1038/nature08821

J Oh, NISC Comparative Sequencing Program, 2014, Biogeography and individuality shape function in the human skin metagenome, Nature, 514, 59, 10.1038/nature13786

C Cortes, 1995, Support-vector networks, Machine Learning, 20, 273, 10.1007/BF00994018

L Breiman, 2001, Random forests, Machine Learning, 45, 5, 10.1023/A:1010933404324

R Tibshirani, 1996, Regression shrinkage and selection via the lasso, J R Stat Soc Series B, 58, 267, 10.1111/j.2517-6161.1996.tb02080.x

H Zou, 2005, Regularization and variable selection via the elastic net, J R Stat Soc Series B, 67, 301, 10.1111/j.1467-9868.2005.00503.x

S Haykin, 2004, Neural Networks. A comprehensive foundation

A Genkin, 2007, Large-scale Bayesian logistic regression for text categorization, Technometrics, 49, 291, 10.1198/004017007000000245

JS Bajaj, 2015, Decompensated cirrhosis and microbiome interpretation, Nature, 525, E1, 10.1038/nature14851

K Forslund, 2015, Disentangling type 2 diabetes and metformin treatment signatures in the human gut microbiota, Nature, 528, 262, 10.1038/nature15766

JR White, 2009, Statistical methods for detecting differentially abundant features in clinical metagenomic samples, PLoS Comput Biol, 5, e1000352, 10.1371/journal.pcbi.1000352

N Segata, 2011, Metagenomic biomarker discovery and explanation, Genome Biol, 12, R60, 10.1186/gb-2011-12-6-r60

G Ditzler, 2015, Fizzy: feature subset selection for metagenomics, BMC Bioinformatics, 16, 1, 10.1186/s12859-015-0793-8

DI Bolnick, 2014, Individual diet has sex-dependent effects on vertebrate gut microbiota, Nat Commun, 5

G Parmigiani, 2004, A cross-study comparison of gene expression studies for the molecular classification of lung cancer, Clin Cancer Res, 10, 2922, 10.1158/1078-0432.CCR-03-0490

AM Riester, 2014, Risk prediction for late-stage ovarian cancer by meta-analysis of 1525 patient samples, J Natl Cancer Inst, dju048, 10.1093/jnci/dju048

T Hastie, 2001

S Abubucker, 2012, Metabolic reconstruction for metagenomic data and its application to the human microbiome, PLoS Comput Biol, 8, e1002358, 10.1371/journal.pcbi.1002358

MM Finucane, 2014, A taxonomic signature of obesity in the microbiome? Getting to the guts of the matter, PLoS ONE, 9, e84689, 10.1371/journal.pone.0084689

F Imhann, 2015, Proton pump inhibitors affect the gut microbiome, Gut

C Bernau, 2014, Cross-study validation for the assessment of prediction algorithms, Bioinformatics, 30, i105, 10.1093/bioinformatics/btu279

BJ Haas, 2011, Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons, Genome Res, 21, 494, 10.1101/gr.112730.110

Jumpstart Consortium Human Microbiome Project Data Generation Working Group, 2012, Evaluation of 16S rDNA-based community profiling for human microbiome research, PLoS ONE, 7, e39315, 10.1371/journal.pone.0039315

AW Walker, 2015, 16S rRNA gene-based profiling of the human infant gut microbiota is strongly influenced by sample processing and PCR primer choice, Microbiome, 3, 1, 10.1186/s40168-015-0087-4

DL Longo, 2016, Data Sharing, N Engl J Med, 374, 276, 10.1056/NEJMe1516564

F Pedregosa, 2011, Scikit-learn: Machine learning in Python, J Mach Learn Res, 12, 2825

VN Vapnik, 1998

S Knerr, 1990, Neurocomputing: Algorithms, Architectures and Applications, NATO ASI, 41

J Platt, 1999, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, Advances in large margin classifiers, 61

T-F Wu, 2004, Probability estimates for multi-class classification by pairwise coupling, J Mach Learn Res, 5, 975

L Breiman, 1984