Normalization of RNA-seq data using factor analysis of control genes or samples
Abstract
Keywords
References
Bullard, J., Purdom, E., Hansen, K. & Dudoit, S. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11, 94 (2010).
Risso, D., Schwartz, K., Sherlock, G. & Dudoit, S. GC-content normalization for RNA-Seq data. BMC Bioinformatics 12, 480 (2011).
Dillies, M.-A. et al. A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief. Bioinform. 14, 671–683 (2013).
Robinson, M.D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).
Hansen, K.D., Irizarry, R.A. & Wu, Z. Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics 13, 204–216 (2012).
Sun, Z. & Zhu, Y. Systematic comparison of RNA-Seq normalization methods using measurement error models. Bioinformatics 28, 2584–2591 (2012).
Yang, Y.H. et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 30, e15 (2002).
Oshlack, A., Emslie, D., Corcoran, L.M. & Smyth, G.K. Normalization of boutique two-color microarrays with a high proportion of differentially expressed probes. Genome Biol. 8, R2 (2007).
Wu, D. et al. The use of miRNA microarrays for the analysis of cancer samples with global miRNA decrease. RNA 19, 876–888 (2013).
Risso, D., Massa, M.S., Chiogna, M. & Romualdi, C. A modified LOESS normalization applied to microRNA arrays: a comparative evaluation. Bioinformatics 25, 2685–2691 (2009).
Baker, S.C. et al. The external RNA controls consortium: a progress report. Nat. Methods 2, 731–734 (2005).
Jiang, L. et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21, 1543–1551 (2011).
Bolstad, B.M., Irizarry, R.A., Astrand, M. & Speed, T.P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19, 185–193 (2003).
Cleveland, W.S. & Devlin, S.J. Locally weighted regression: an approach to regression analysis by local fitting. J. Am. Stat. Assoc. 83, 596–610 (1988).
Qing, T., Yu, Y., Du, T. & Shi, L. mRNA enrichment protocols determine the quantification characteristics of external RNA spike-in controls in RNA-Seq studies. Sci. China Life Sci. 56, 134–142 (2013).
SEQC/MAQC-III Consortium. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat. Biotechnol. doi:10.1038/nbt.2957 (24 August 2014).
Canales, R.D. et al. Evaluation of DNA microarray results with quantitative gene expression platforms. Nat. Biotechnol. 24, 1115–1122 (2006).
Ferreira, T. et al. Silencing of odorant receptor genes by G Protein βγ signaling ensures the expression of one odorant receptor per olfactory sensory neuron. Neuron 81, 847–859 (2014).
Gagnon-Bartsch, J. & Speed, T. Using control genes to correct for unwanted variation in microarray data. Biostatistics 13, 539–552 (2012).
Gagnon-Bartsch, J., Jacob, L. & Speed, T.P. Removing unwanted variation from high dimensional data with negative controls. Tech. Rep. 820, Department of Statistics, University of California, Berkeley (2013).
Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068 (2008).
ENCODE Project Consortium. The ENCODE (ENCyclopedia of DNA elements) project. Science 306, 636–640 (2004).
Leek, J.T. & Storey, J.D. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 3, 1724–1735 (2007).
't Hoen, P. et al. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat. Biotechnol. 31, 1015–1022 (2013).
Jacob, L., Gagnon-Bartsch, J. & Speed, T.P. Correcting gene expression data when neither the unwanted variation nor the factor of interest are observed. Tech. Rep. 818, Department of Statistics, University of California, Berkeley (2013).
Tang, F., Lao, K. & Surani, M.A. Development and applications of single-cell transcriptome analysis. Nat. Methods 8, S6–S11 (2011).
Brennecke, P. et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat. Methods 10, 1093–1095 (2013).
Cleveland, W.S. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74, 829–836 (1979).
Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).
Listgarten, J., Kadie, C., Schadt, E.E. & Heckerman, D. Correction for hidden confounders in the genetic analysis of gene expression. Proc. Natl. Acad. Sci. USA 107, 16465–16470 (2010).
Robinson, M.D., McCarthy, D.J. & Smyth, G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010).
Smyth, G.K. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3, 3 (2004).
Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc., B 57, 289–300 (1995).
