Beyond comparisons of means: understanding changes in gene expression at the single-cell level

Genome Biology - Volume 17 - Pages 1-14 - 2016
Catalina A. Vallejos (1,2), Sylvia Richardson (1), John C. Marioni (2,3)
(1) MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, UK
(2) EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
(3) Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK

Abstract

Traditional differential expression tools are limited to detecting changes in overall expression, and fail to uncover the rich information provided by single-cell level data sets. We present a Bayesian hierarchical model that builds upon BASiCS to study changes that lie beyond comparisons of means, incorporating built-in normalization and quantifying technical artifacts by borrowing information from spike-in genes. Using a probabilistic approach, we highlight genes undergoing changes in cell-to-cell heterogeneity but whose overall expression remains unchanged. Control experiments validate our method’s performance, and a case study suggests that novel biological insights can be revealed. Our method is implemented in R and available at https://github.com/catavallejos/BASiCS.
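The probabilistic decision step described in the abstract can be illustrated with a minimal sketch. It assumes that posterior draws of the log2 fold change in mean expression and in over-dispersion are already available for a gene, and it flags a change when the posterior probability of exceeding a tolerance threshold is high. The function names, the thresholds (`tau_mean`, `tau_disp`), and the evidence cutoff are hypothetical choices for illustration. This is not the BASiCS API (the method itself is implemented in R) and may differ from the authors' exact decision rule.

```python
import random


def tail_posterior_probability(log_fold_changes, threshold):
    """Fraction of posterior draws whose absolute log fold change exceeds threshold."""
    if not log_fold_changes:
        raise ValueError("need at least one posterior draw")
    exceed = sum(1 for x in log_fold_changes if abs(x) > threshold)
    return exceed / len(log_fold_changes)


def classify_gene(mean_lfc_draws, disp_lfc_draws,
                  tau_mean=0.4, tau_disp=0.4, evidence=0.9):
    """Label one gene from posterior draws (all thresholds are illustrative assumptions).

    A gene is tested for a change in heterogeneity only when there is no
    strong evidence of a change in its mean expression.
    """
    if tail_posterior_probability(mean_lfc_draws, tau_mean) > evidence:
        return "mean change"
    if tail_posterior_probability(disp_lfc_draws, tau_disp) > evidence:
        return "heterogeneity change only"
    return "no change detected"


if __name__ == "__main__":
    # Simulated draws stand in for MCMC output; they are not real data.
    rng = random.Random(0)
    mean_draws = [rng.gauss(0.0, 0.1) for _ in range(2000)]  # mean roughly unchanged
    disp_draws = [rng.gauss(1.0, 0.2) for _ in range(2000)]  # over-dispersion shifted
    print(classify_gene(mean_draws, disp_draws))
```

Gating the heterogeneity test on the absence of a mean change is one way to target genes whose overall expression is stable but whose cell-to-cell variability differs between populations.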
