ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq

PLoS ONE - Volume 8, Issue 7 - Page e67019
Andrew D. Fernandes1, Jean M. Macklaim2,3, Thomas Linn4, Gregor Reid2,4,5, Gregory B. Gloor2,3
1YouKaryote Genomics, London, Ontario, Canada
2Canadian Research & Development Centre for Probiotics, Lawson Health Research Institute, London, Ontario, Canada
3Department of Biochemistry, The University of Western Ontario, London, Ontario, Canada
4Department of Microbiology & Immunology, The University of Western Ontario, London, Ontario, Canada
5Department of Surgery, The University of Western Ontario, London, Ontario, Canada

Abstract

Keywords


References

NC Roy, 2011, A comparison of analog and next-generation transcriptomic tools for mammalian studies, Brief Funct Genomics, 10, 135, 10.1093/bfgp/elr005

JE Crawford, 2010, De novo transcriptome sequencing in Anopheles funestus using Illumina RNA-Seq technology, PLoS One, 5, e14202, 10.1371/journal.pone.0014202

MG Grabherr, 2011, Full-length transcriptome assembly from RNA-Seq data without a reference genome, Nat Biotechnol, 29, 644, 10.1038/nbt.1883

C Trapnell, 2009, TopHat: discovering splice junctions with RNA-Seq, Bioinformatics, 25, 1105, 10.1093/bioinformatics/btp120

K Wang, 2010, MapSplice: accurate mapping of RNA-Seq reads for splice junction discovery, Nucleic Acids Res, 38, e178, 10.1093/nar/gkq622

D Kim, 2011, TopHat-Fusion: an algorithm for discovery of novel fusion transcripts, Genome Biol, 12, R72, 10.1186/gb-2011-12-8-r72

AM Smith, 2010, Highly-multiplexed barcode sequencing: an efficient method for parallel analysis of pooled samples, Nucleic Acids Res, 38, e142, 10.1093/nar/gkq368

H van Bakel, 2010, Most "dark matter" transcripts are associated with known genes, PLoS Biol, 8, e1000371, 10.1371/journal.pbio.1000371

LM McIntyre, 2011, RNA-Seq: technical variability and sampling, BMC Genomics, 12, 293, 10.1186/1471-2164-12-293

Z Wu, 2009, A review of statistical methods for preprocessing oligonucleotide microarrays, Stat Methods Med Res, 18, 533, 10.1177/0962280209351924

Pachter L (2011) Models for transcript quantification from RNA-Seq. ArXiv 1104.3889.

S Nakagawa, 2007, Effect size, confidence interval and statistical significance: a practical guide for biologists, Biol Rev Camb Philos Soc, 82, 591, 10.1111/j.1469-185X.2007.00027.x

DH Parks, 2010, Identifying biologically relevant differences between metagenomic communities, Bioinformatics, 26, 715, 10.1093/bioinformatics/btq041

K Pearson, 1896, Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs, Proceedings of the Royal Society of London, 60, 489, 10.1098/rspl.1896.0076

J Aitchison, 2005, Compositional data analysis: Where are we and where should we be heading?, Mathematical Geology, 37, 829, 10.1007/s11004-005-7383-7

Pawlowsky-Glahn V, Egozcue JJ (2006) Compositional data and their analysis: an introduction. Geological Society, London, Special Publications 264: 1–10.

J Egozcue, 2005, Groups of parts and their balances in compositional data analysis, Mathematical Geology, 37, 795, 10.1007/s11004-005-7381-9

JJ Egozcue, 2003, Isometric logratio transformations for compositional data analysis, Math Geol, 35, 279, 10.1023/A:1023818214614

JC Marioni, 2008, RNA-Seq: an assessment of technical reproducibility and comparison with gene expression arrays, Genome Res, 18, 1509, 10.1101/gr.079558.108

PN Polymenakou, 2009, Phylogenetic diversity of sediment bacteria from the southern Cretan margin, eastern Mediterranean Sea, Syst Appl Microbiol, 32, 17, 10.1016/j.syapm.2008.09.006

AZ Rosenthal, 2011, RNA-Seq reveals cooperative metabolic interactions between two termite-gut spirochete species in co-culture, ISME J, 5, 1133, 10.1038/ismej.2011.3

MD Robinson, 2008, Small-sample estimation of negative binomial dispersion, with applications to SAGE data, Biostatistics, 9, 321, 10.1093/biostatistics/kxm030

S Anders, 2010, Differential expression analysis for sequence count data, Genome Biol, 11, R106, 10.1186/gb-2010-11-10-r106

Newey WK, McFadden D (1994) Large sample estimation and hypothesis testing. In: Engle R, McFadden D, editors, Handbook of Econometrics, Elsevier Science, volume 4, chapter 35. pp. 2111–2245.

Jaynes ET, Bretthorst GL (2003) Probability theory: the logic of science. Cambridge, UK: Cambridge University Press. URL http://www.loc.gov/catdir/samples/cam033/2002071486.html.

Frigyik BA, Kapila A, Gupta MR (2010) Introduction to the Dirichlet distribution and related processes. Technical Report UWEETR-2010-0006, Department of Electrical Engineering, University of Washington. URL http://www.ee.washington.edu/research/guptalab/publications/UWEETR-2010-0006.pdf.

J Berger, 1992, Ordered group reference priors with application to the multinomial problem, Biometrika, 79, 25, 10.1093/biomet/79.1.25

J Bernardo, 2005, Reference analysis, Bayesian Thinking, Modeling and Computation, 25, 17, 10.1016/S0169-7161(05)25002-2

JO Berger, 2009, The formal definition of reference priors, Annals of Statistics, 37, 905, 10.1214/07-AOS587

L Wang, 2010, DEGSeq: an R package for identifying differentially expressed genes from RNA-Seq data, Bioinformatics, 26, 136, 10.1093/bioinformatics/btp612

Macklaim JM, Fernandes AD, Di Bella MJ, Hammond JA, Reid G, et al. (2013) Comparative meta-RNA-Seq of the vaginal microbiota and differential expression by Lactobacillus iners in health and dysbiosis. Microbiome doi: 10.1186/2049-2618-1-12.

Langmead B (2010) Aligning short sequencing reads with bowtie. Curr Protoc Bioinformatics Chapter 11: Unit 11.7.

J Friedman, 2012, Inferring correlation networks from genomic survey data, PLoS Comput Biol, 8, e1002687, 10.1371/journal.pcbi.1002687

PS La Rosa, 2012, Hypothesis testing and power calculations for taxonomic-based human microbiome data, PLoS One, 7, e52078, 10.1371/journal.pone.0052078

I Holmes, 2012, Dirichlet multinomial mixtures: generative models for microbial metagenomics, PLoS One, 7, e30126, 10.1371/journal.pone.0030126

R Blekhman, 2010, Sex-specific and lineage-specific alternative splicing in primates, Genome Research, 20, 180, 10.1101/gr.099226.109

Altman DG, Bland JM (1983) Measurement in medicine: The analysis of method comparison studies. Journal of the Royal Statistical Society Series D (The Statistician) 32: 307–317.

MD Robinson, 2010, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Bioinformatics, 26, 139, 10.1093/bioinformatics/btp616

C Trapnell, 2010, Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation, Nat Biotechnol, 28, 511, 10.1038/nbt.1621

M Mols, 2010, Comparative analysis of transcriptional and physiological responses of Bacillus cereus to organic and inorganic acid shocks, Int J Food Microbiol, 137, 13, 10.1016/j.ijfoodmicro.2009.09.027

A Sboner, 2011, The real cost of sequencing: higher than you think!, Genome Biol, 12, 125, 10.1186/gb-2011-12-8-125

VM Kvam, 2012, A comparison of statistical methods for detecting differentially expressed genes from RNA-Seq data, Am J Bot, 99, 248, 10.3732/ajb.1100340

M Hamady, 2009, Microbial community profiling for human microbiome projects: Tools, techniques, and challenges, Genome Res, 19, 1141, 10.1101/gr.085464.108

B Rodriguez-Brito, 2006, An application of statistics to comparative metagenomics, BMC Bioinformatics, 7, 162, 10.1186/1471-2105-7-162

Efron B, Tibshirani R (1993) An introduction to the bootstrap, volume 57. New York: Chapman & Hall. URL http://www.loc.gov/catdir/enhancements/fy0730/93004489-d.html.

JA Gilbert, 2008, Detection of large numbers of novel sequences in the metatranscriptomes of complex marine microbial communities, PLoS One, 3, e3042, 10.1371/journal.pone.0003042

JA Gilbert, 2010, Metagenomes and metatranscriptomes from the L4 long-term coastal monitoring station in the western English Channel, Stand Genomic Sci, 3, 183, 10.4056/sigs.1202536

J McCarren, 2010, Microbial community transcriptomes reveal microbes and metabolic pathways associated with dissolved organic matter turnover in the sea, Proc Natl Acad Sci U S A, 107, 16420, 10.1073/pnas.1010732107

JR White, 2009, Statistical methods for detecting differentially abundant features in clinical metagenomic samples, PLoS Comput Biol, 5, e1000352, 10.1371/journal.pcbi.1000352

JJ Faith, 2011, Predicting a human gut microbiota's response to diet in gnotobiotic mice, Science, 333, 101, 10.1126/science.1206025

PJ Turnbaugh, 2010, Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins, Proc Natl Acad Sci U S A, 107, 7503, 10.1073/pnas.1002355107

E Kristiansson, 2009, ShotgunFunctionalizeR: an R-package for functional comparison of metagenomes, Bioinformatics, 25, 2737, 10.1093/bioinformatics/btp508

TJ Hardcastle, 2013, Empirical Bayesian analysis of paired high-throughput sequencing data with a beta-binomial distribution, BMC Bioinformatics, 14, 135, 10.1186/1471-2105-14-135

I Nookaew, 2012, A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae, Nucleic Acids Res, 40, 10084, 10.1093/nar/gks804

RP Nugent, 1991, Reliability of diagnosing bacterial vaginosis is improved by a standardized method of Gram stain interpretation, J Clin Microbiol, 29, 297, 10.1128/JCM.29.2.297-301.1991

W Li, 2006, CD-Hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, 22, 1658, 10.1093/bioinformatics/btl158

Oliveros JC (2007). Venny. An interactive tool for comparing lists with Venn diagrams. URL http://bioinfogp.cnb.csic.es/tools/venny/index.html.