Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments

BMC Bioinformatics - Volume 11, Issue 1 - 2010
James Bullard1, Elizabeth Purdom2, Kasper D. Hansen1, Sandrine Dudoit2
1Division of Biostatistics, University of California, Berkeley, Berkeley, CA, USA
2Department of Statistics, University of California, Berkeley, Berkeley, CA, USA

References

Chiang DY, Getz G, Jaffe DB, O'Kelly MJT, Zhao X, Carter SL, Russ C, Nusbaum C, Meyerson M, Lander ES: High-resolution mapping of copy-number alterations with massively parallel sequencing. Nature Methods 2009, 6: 99–103. 10.1038/nmeth.1276

Dohm JC, Lottaz C, Borodina T, Himmelbauer H: Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Research 2008, 36(16):e105. 10.1093/nar/gkn425

't Hoen PAC, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RHAM, de Menezes RX, Boer JM, van Ommen GJB, den Dunnen JT: Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Research 2008, 36(21):e141. 10.1093/nar/gkn705

Lee A, Hansen KD, Bullard J, Dudoit S, Sherlock G: Novel low abundance and transient RNAs in yeast revealed by tiling microarrays and ultra high-throughput sequencing are not conserved across closely related yeast species. PLoS Genetics 2008, 4(12):e1000299. 10.1371/journal.pgen.1000299

Li H, Lovci MT, Kwon YS, Rosenfeld MG, Fu XD, Yeo GW: Determination of tag density required for digital transcriptome analysis: Application to an androgen-sensitive prostate cancer model. PNAS 2008, 105(51):20179–20184. 10.1073/pnas.0807121105

Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Research 2008, 18(9):1509–1517. 10.1101/gr.079558.108

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 2008, 5(7):621–628. 10.1038/nmeth.1226

Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M: The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 2008, 320(5881):1344–1349. 10.1126/science.1158441

Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB: Alternative isoform regulation in human tissue transcriptomes. Nature 2008, 456(7221):470–476. 10.1038/nature07509

MAQC Consortium: The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nature Biotechnology 2006, 24(9):1151–1161. 10.1038/nbt1239

Oshlack A, Wakefield MJ: Transcript length bias in RNA-seq data confounds systems biology. Biology Direct 2009, 4: 14.

Illumina: Sequencing Analysis Software User Guide For Pipeline Version 1.3 and CASAVA Version 1.0 T. Illumina, Inc.; 2008. [Part # 1005359 Rev. A] [ http://icom.illumina.com/icom/software.ilmn?id=277 ]

Canales RD, Luo Y, Willey JC, Austermiller B, Barbacioru CC, Boysen C, Hunkapiller K, Jensen RV, Knight CR, Lee KY, Ma Y, Maqsodi B, Papallo A, Peters EH, Poulter K, Ruppel PL, Samaha RR, Shi L, Yang W, Zhang L, Goodsaid FM: Evaluation of DNA microarray results with quantitative gene expression platforms. Nature Biotechnology 2006, 24(9):1115–1122. 10.1038/nbt1236

Illumina: Preparing Samples for Sequencing mRNA. Illumina, Inc.; 2009. [Part # 1004898 Rev. A] [ http://icom.illumina.com/icom/software.ilmn?id=277 ]

Bentley DR, Balasubramanian S, Swerdlow HP, et al.: Accurate whole human genome sequencing using reversible terminator chemistry. Nature 2008, 456(7218):53–59. 10.1038/nature07517

Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 2009, 10(3):R25. 10.1186/gb-2009-10-3-r25

Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research 1998, 8(3):186–194.

Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data. Biostatistics 2003, 4(2):249–264. 10.1093/biostatistics/4.2.249

Taub MA: Analysis of high-throughput biological data: some statistical problems in RNA-seq and mouse genotyping. PhD thesis. Department of Statistics, UC Berkeley; 2009.

Durinck S, Bullard J, Spellman PT, Dudoit S: GenomeGraphs: integrated genomic data visualization with R. BMC Bioinformatics 2009, 10: Article 2. 10.1186/1471-2105-10-2

Lu J, Tomfohr JK, Kepler TB: Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach. BMC Bioinformatics 2005, 6: 165. 10.1186/1471-2105-6-165

Robinson MD, Smyth GK: Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 2007, 23(21):2881–2887. 10.1093/bioinformatics/btm453
