Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics

Springer Science and Business Media LLC - Volume 19 - Pages 1-16 - 2018
Kelly Street [1,2], Davide Risso [3], Russell B. Fletcher [4], Diya Das [4,5], John Ngai [4,6,7], Nir Yosef [8,2], Elizabeth Purdom [9,2], Sandrine Dudoit [1,9,2,5]
[1] Division of Biostatistics, School of Public Health, University of California, Berkeley, USA
[2] Center for Computational Biology, University of California, Berkeley, USA
[3] Division of Biostatistics and Epidemiology, Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, USA
[4] Department of Molecular and Cell Biology, University of California, Berkeley, USA
[5] Berkeley Institute for Data Science, University of California, Berkeley, USA
[6] Helen Wills Neuroscience Institute, University of California, Berkeley, USA
[7] QB3 Berkeley Functional Genomics Laboratory, Berkeley, USA
[8] Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, USA
[9] Department of Statistics, University of California, Berkeley, USA

Abstract

Single-cell transcriptomics allows researchers to investigate complex communities of heterogeneous cells. It can be applied to stem cells and their descendants in order to chart the progression from multipotent progenitors to fully differentiated cells. While a variety of statistical and computational methods have been proposed for inferring cell lineages, the problem of accurately characterizing multiple branching lineages remains difficult to solve. We introduce Slingshot, a novel method for inferring cell lineages and pseudotimes from single-cell gene expression data. In previously published datasets, Slingshot correctly identifies the biological signal for one to three branching trajectories. Additionally, our simulation study shows that Slingshot infers more accurate pseudotimes than other leading methods. Slingshot is a uniquely robust and flexible tool which combines the highly stable techniques necessary for noisy single-cell data with the ability to identify multiple trajectories. Accurate lineage inference is a critical step in the identification of dynamic temporal gene expression.
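The abstract names the task (inferring lineages and pseudotimes) without giving the procedure. As a rough illustration of what that task involves, the sketch below assumes cells have already been reduced to a low-dimensional embedding and assigned cluster labels. It builds a minimum spanning tree over cluster centroids to obtain branching lineages, then assigns each cell a pseudotime by projecting it onto the straight-line path through a lineage's centroids. The straight-line projection is a deliberate simplification standing in for the smooth principal curves the published method fits, and the names `cluster_lineages` and `polyline_pseudotime`, along with the toy data, are hypothetical and not part of the Slingshot package.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist


def cluster_lineages(X, labels, root):
    """Return cluster centroids and the root-to-leaf paths of a centroid MST."""
    keys = [c.item() for c in np.unique(labels)]
    centroids = {k: X[labels == k].mean(axis=0) for k in keys}
    points = np.array([centroids[k] for k in keys])
    # MST over pairwise centroid distances, symmetrized into an adjacency mask.
    mst = minimum_spanning_tree(cdist(points, points)).toarray()
    adjacent = (mst + mst.T) > 0
    # Depth-first walk from the root; every leaf reached closes one lineage.
    lineages, stack = [], [[keys.index(root)]]
    while stack:
        path = stack.pop()
        children = [int(j) for j in np.flatnonzero(adjacent[path[-1]])
                    if int(j) not in path]
        if not children:
            lineages.append([keys[i] for i in path])
        stack.extend(path + [j] for j in children)
    return centroids, lineages


def polyline_pseudotime(X, waypoints):
    """Arc-length position of each cell's nearest point on a piecewise-linear path."""
    best_dist = np.full(len(X), np.inf)
    pseudotime = np.zeros(len(X))
    offset = 0.0  # path length accumulated before the current segment
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        seg = b - a
        seg_len = np.linalg.norm(seg)
        # Fraction along the segment of each cell's orthogonal projection.
        t = np.clip((X - a) @ seg / seg_len**2, 0.0, 1.0)
        dist = np.linalg.norm(X - (a + t[:, None] * seg), axis=1)
        closer = dist < best_dist
        best_dist[closer] = dist[closer]
        pseudotime[closer] = offset + t[closer] * seg_len
        offset += seg_len
    return pseudotime


# Toy Y-shaped data (an assumption for illustration): one root cluster,
# one intermediate cluster, and two terminal clusters.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 1.5], [4.0, -1.5]])
labels = np.repeat(np.arange(4), 30)
X = true_centers[labels] + rng.normal(scale=0.1, size=(120, 2))

centroids, lineages = cluster_lineages(X, labels, root=0)
for lineage in lineages:
    waypoints = np.array([centroids[k] for k in lineage])
    cells = np.isin(labels, lineage)
    pt = polyline_pseudotime(X[cells], waypoints)
    print(lineage, "pseudotime range:", pt.min().round(2), pt.max().round(2))
```

Working at the cluster level keeps the tree small and comparatively insensitive to noise in individual cells. In this sketch, cells in a cluster shared by two lineages (here the root and intermediate clusters) receive one pseudotime per lineage.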
