Single-Cell Transcriptomics Bioinformatics and Computational Challenges

Olivier Poirion1, Xun Zhu1,2, Travers Ching1,2, Lana X. Garmire1
1Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
2Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI, USA

Abstract

Keywords

References

Lun, 2016, Pooling across cells to normalize single-cell RNA sequencing data with many zero counts, Genome Biol., 17, 75, 10.1186/s13059-016-0947-7

Amir, 2013, viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia, Nat. Biotechnol., 31, 545, 10.1038/nbt.2594

Anders, 2010, Differential expression analysis for sequence count data, Genome Biol., 11, R106, 10.1186/gb-2010-11-10-r106

Anders, 2014, HTSeq: a Python framework to work with high-throughput sequencing data, Bioinformatics, 31, 166, 10.1093/bioinformatics/btu638

Andrews, 2010, FastQC: a quality control tool for high throughput sequence data

Balasubramanian, 2002, The isomap algorithm and topological stability, Science, 295, 7, 10.1126/science.295.5552.7a

Barron, 2016, Identifying and removing the cell-cycle effect from single-cell RNA-sequencing data, arXiv:1605.04492

Bendall, 2014, Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development, Cell, 157, 714, 10.1016/j.cell.2014.04.005

Beyer, 1999, When Is ‘Nearest Neighbor’ Meaningful?, Database Theory–ICDT'99, 217, 10.1007/3-540-49257-7_15

Bolger, 2014, Trimmomatic: a flexible trimmer for Illumina sequence data, Bioinformatics, 30, 2114, 10.1093/bioinformatics/btu170

Bose, 2015, Scalable microfluidics for single cell RNA printing and sequencing, Genome Biol., 16, 120, 10.1186/s13059-015-0684-3

Bray, 2016, Near-optimal probabilistic RNA-seq quantification, Nat. Biotechnol., 34, 525, 10.1038/nbt.3519

Brennecke, 2013, Accounting for technical noise in single-cell RNA-seq experiments, Nat. Methods, 10, 1093, 10.1038/nmeth.2645

Buettner, 2015, Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells, Nat. Biotechnol., 33, 55, 10.1038/nbt.3102

Buettner, 2012, A novel approach for resolving differences in single-cell gene expression patterns from zygote to blastocyst, Bioinformatics, 28, i626, 10.1093/bioinformatics/bts385

Campbell, 2015, Laplacian eigenmaps and principal curves for high resolution pseudotemporal ordering of single-cell RNA-seq profiles, bioRxiv, 027219, 10.1101/027219

Chandramohan, 2013, Benchmarking RNA-Seq quantification tools, Engineering In Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE, 647, 10.1109/EMBC.2013.6609583

Ching, 2016, Pan-Cancer analyses reveal long intergenic non-coding RNAs relevant to tumor diagnosis, subtyping and prognosis, EBioMedicine, 7, 62, 10.1016/j.ebiom.2016.03.023

Cox, 2010, SolexaQA: at-a-glance quality assessment of Illumina second-generation sequencing data, BMC Bioinformatics, 11, 485, 10.1186/1471-2105-11-485

van der Maaten, 2008, Visualizing data using t-SNE, J. Mach. Learn. Res., 9, 2579

Dey, 2015, Integrated genome and transcriptome sequencing of the same cell, Nat. Biotechnol., 33, 285, 10.1038/nbt.3129

Diaz, 2016, SCell: integrated analysis of single-cell RNA-Seq data, Bioinformatics, 32, 2219, 10.1093/bioinformatics/btw201

Ding, 2015, Normalization and noise reduction for single cell RNA-Seq experiments, Bioinformatics, 31, 2225, 10.1093/bioinformatics/btv122

Dobin, 2015, Mapping RNA-seq reads with STAR, Curr. Protoc. Bioinform., 51, 11.14.1, 10.1002/0471250953.bi1114s51

Engström, 2013, Systematic evaluation of spliced alignment programs for RNA-seq data, Nat. Methods, 10, 1185, 10.1038/nmeth.2722

Fan, 2016, Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis, Nat. Methods, 13, 241, 10.1038/nmeth.3734

Finak, 2015, MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data, Genome Biol., 16, 278, 10.1186/s13059-015-0844-5

Fonseca, 2014, RNA-Seq gene profiling - a systematic empirical comparison, PLoS ONE, 9, e107026, 10.1371/journal.pone.0107026

Fortunato, 2010, Community detection in graphs, Phys. Rep., 486, 75, 10.1016/j.physrep.2009.11.002

Freeman, 2016, Single-Cell RNA-seq reveals activation of unique gene groups as a consequence of stem cell-parenchymal cell fusion, Sci. Rep., 6, 23270, 10.1038/srep23270

Gao, 2016, Integrative single-cell transcriptomics reveals molecular networks defining neuronal maturation during postnatal neurogenesis, Cereb. Cortex, 10.1093/cercor/bhw040

Grün, 2015, Design and analysis of single-cell sequencing experiments, Cell, 163, 799, 10.1016/j.cell.2015.10.039

Guo, 2015, SINCERA: a pipeline for single-cell RNA-Seq profiling analysis, PLoS Comput. Biol., 11, e1004575, 10.1371/journal.pcbi.1004575

Haghverdi, 2015, Diffusion maps for high-dimensional single-cell analysis of differentiation data, Bioinformatics, 31, 2989, 10.1093/bioinformatics/btv325

Han, 2014, Co-detection and sequencing of genes and transcripts from the same single cells facilitated by a microfluidics platform, Sci. Rep., 4, 6485, 10.1038/srep06485

Handel, 2016, Assessing similarity to primary tissue and cortical layer identity in induced pluripotent stem cell-derived cortical neurons through single-cell transcriptomics, Hum. Mol. Genet., 25, 989, 10.1093/hmg/ddv637

Harris, 2015, Molecular organization of CA1 interneuron classes, bioRxiv, 034595, 10.1101/034595

Hartuv, 2000, A clustering algorithm based on graph connectivity, Inf. Process. Lett., 76, 175, 10.1016/S0020-0190(00)00142-3

Hou, 2016, Single-Cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas, Cell Res., 26, 304, 10.1038/cr.2016.23

Ilicic, 2016, Classification of low quality cells from single-cell RNA-seq data, Genome Biol., 17, 29, 10.1186/s13059-016-0888-1

Islam, 2014, Quantitative single-cell RNA-Seq with unique molecular identifiers, Nat. Methods, 11, 163, 10.1038/nmeth.2772

Jaitin, 2014, Massively parallel single-cell RNA-Seq for marker-free decomposition of tissues into cell types, Science, 343, 776, 10.1126/science.1247651

Ji, 2016, TSCAN: pseudo-time reconstruction and evaluation in single-cell RNA-Seq analysis, Nucleic Acids Res., 44, e117, 10.1093/nar/gkw430

Jiang, 2016, GiniClust: detecting rare cell types from single-cell gene expression data with Gini index, Genome Biol., 17, 144, 10.1186/s13059-016-1010-4

Jiang, 2016, Quality control of single-cell RNA-seq by SinQC, Bioinformatics, 10.1093/bioinformatics/btw176

Johnson, 2007, Adjusting batch effects in microarray expression data using empirical Bayes methods, Biostatistics, 8, 118, 10.1093/biostatistics/kxj037

Katayama, 2013, SAMstrt: statistical test for differential expression in single-cell transcriptome with spike-in normalization, Bioinformatics, 29, 2943, 10.1093/bioinformatics/btt511

Katrib, 2016, Radiotranscriptomics: a synergy of imaging and transcriptomics in clinical assessment, Quant. Biol., 4, 1, 10.1007/s40484-016-0061-6

Kharchenko, 2014, Bayesian approach to single-cell differential expression analysis, Nat. Methods, 11, 740, 10.1038/nmeth.2967

Kim, 2015, HISAT: a fast spliced aligner with low memory requirements, Nat. Methods, 12, 357, 10.1038/nmeth.3317

Kim, 2013, TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions, Genome Biol., 14, R36, 10.1186/gb-2013-14-4-r36

Kim, 2015, Single-Cell mRNA sequencing identifies subclonal heterogeneity in anti-cancer drug responses of lung adenocarcinoma cells, Genome Biol., 16, 127, 10.1186/s13059-015-0692-3

Kimmerling, 2016, A microfluidic platform enabling single-cell RNA-seq of multigenerational lineages, Nat. Commun., 7, 10220, 10.1038/ncomms10220

Kumar, 2014, Deconstructing transcriptional heterogeneity in pluripotent stem cells, Nature, 516, 56, 10.1038/nature13920

Kvastad, 2015, Single cell analysis of cancer cells using an improved RT-MLPA method has potential for cancer diagnosis and monitoring, Sci. Rep., 5, 16519, 10.1038/srep16519

Leek, 2014, svaseq: removing batch effects and other unwanted noise from sequencing data, Nucleic Acids Res., 42, 10.1093/nar/gku864

Leng, 2016, OEFinder: a user interface to identify and visualize ordering effects in single-cell RNA-seq data, Bioinformatics, 32, 1408, 10.1093/bioinformatics/btw004

Leng, 2015, Oscope identifies oscillatory genes in unsynchronized single-cell RNA-seq experiments, Nat. Methods, 12, 947, 10.1038/nmeth.3549

Levine, 2015, Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis, Cell, 162, 184, 10.1016/j.cell.2015.05.047

Li, 2011, RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome, BMC Bioinformatics, 12, 323, 10.1186/1471-2105-12-323

Li, 2009, The sequence alignment/map format and SAMtools, Bioinformatics, 25, 2078, 10.1093/bioinformatics/btp352

Li, 2013, Finding consistent patterns: a nonparametric approach for identifying differential expression in RNA-seq data, Stat. Methods Med. Res., 22, 519, 10.1177/0962280211428386

Liao, 2013, featureCounts: an efficient general purpose program for assigning sequence reads to genomic features, Bioinformatics, 10.1093/bioinformatics/btt656

Lohr, 2014, Whole exome sequencing of circulating tumor cells provides a window into metastatic prostate cancer, Nat. Biotechnol., 32, 479, 10.1038/nbt.2892

Love, 2014, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2, Genome Biol., 15, 550, 10.1186/s13059-014-0550-8

Macaulay, 2015, G&T-Seq: parallel sequencing of single-cell genomes and transcriptomes, Nat. Methods, 12, 519, 10.1038/nmeth.3370

Marco, 2014, Bifurcation analysis of single-cell gene expression data reveals epigenetic landscape, Proc. Natl. Acad. Sci., 111, E5643, 10.1073/pnas.1408993111

Martin, 2011, Cutadapt removes adapter sequences from high-throughput sequencing reads, EMBnet. J., 17, 10, 10.14806/ej.17.1.200

Meyer, 2016, Dnmt3a haploinsufficiency transforms Flt3-ITD myeloproliferative disease into a rapid, spontaneous, and fully-penetrant acute myeloid leukemia, Cancer Discov., 6, 501, 10.1158/2159-8290.CD-16-0008

Miyamoto, 2015, RNA-seq of single prostate CTCs implicates noncanonical Wnt signaling in antiandrogen resistance, Science, 349, 1351, 10.1126/science.aab0917

Moignard, 2015, Decoding the regulatory network of early blood development from single-cell gene expression measurements, Nat. Biotechnol., 33, 269, 10.1038/nbt.3154

Navin, 2011, Tumour evolution inferred by single-cell sequencing, Nature, 472, 90, 10.1038/nature09807

Ntranos, 2016, Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts, Genome Biol., 17, 112, 10.1186/s13059-016-0970-8

Patel, 2014, Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma, Science, 344, 1396, 10.1126/science.1254257

Petropoulos, 2016, Single-cell RNA-seq reveals lineage and X chromosome dynamics in human preimplantation embryos, Cell, 165, 1012, 10.1016/j.cell.2016.03.023

Pettit, 2014, Identifying cell types from spatially referenced single-cell expression datasets, PLoS Comput. Biol., 10, e1003824, 10.1371/journal.pcbi.1003824

Pierson, 2015, ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis, Genome Biol., 16, 1, 10.1186/s13059-015-0805-z

Pollen, 2014, Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex, Nat. Biotechnol., 32, 1053, 10.1038/nbt.2967

Prabhakaran, 2016, Dirichlet process mixture model for correcting technical variation in single-cell gene expression data, Proceedings of the 33rd International Conference on Machine Learning, 1070

Ramsköld, 2012, Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells, Nat. Biotechnol., 30, 777, 10.1038/nbt.2282

Robinson, 2010, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Bioinformatics, 26, 139, 10.1093/bioinformatics/btp616

Rotem, 2015, Single-Cell ChIP-seq reveals cell subpopulations defined by chromatin state, Nat. Biotechnol., 33, 1165, 10.1038/nbt.3383

Satija, 2015, Spatial reconstruction of single-cell gene expression data, Nat. Biotechnol., 33, 495, 10.1038/nbt.3192

Schurch, 2016, How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?, RNA, 22, 839, 10.1261/rna.053959.115

Shekhar, 2014, Automatic classification of cellular expression by nonlinear stochastic embedding (ACCENSE), Proc. Natl. Acad. Sci. U.S.A., 111, 202, 10.1073/pnas.1321405111

Shin, 2015, Single-Cell RNA-Seq with Waterfall reveals molecular cascades underlying adult neurogenesis, Cell Stem Cell, 17, 360, 10.1016/j.stem.2015.07.013

Tang, 2010, Tracing the derivation of embryonic stem cells from the inner cell mass by single-cell RNA-seq analysis, Cell Stem Cell, 6, 468, 10.1016/j.stem.2010.03.015

Tenenbaum, 2000, A global geometric framework for nonlinear dimensionality reduction, Science, 290, 2319, 10.1126/science.290.5500.2319

Ting, 2014, Single-cell RNA sequencing identifies extracellular matrix gene expression by pancreatic circulating tumor cells, Cell Rep., 8, 1905, 10.1016/j.celrep.2014.08.029

Tirosh, 2016, Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq, Science, 352, 189, 10.1126/science.aad0501

Trapnell, 2014, Pseudo-temporal ordering of individual cells reveals dynamics and regulators of cell fate decisions, Nat. Biotechnol., 32, 381, 10.1038/nbt.2859

Trapnell, 2009, TopHat: discovering splice junctions with RNA-seq, Bioinformatics, 25, 1105, 10.1093/bioinformatics/btp120

Trapnell, 2010, Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation, Nat. Biotechnol., 28, 511, 10.1038/nbt.1621

Ching, 2015, Non-coding yet non-trivial: a review on the computational genomics of lincRNAs, BioData Min., 8, 44, 10.1186/s13040-015-0075-z

Treutlein, 2014, Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq, Nature, 509, 371, 10.1038/nature13173

Tsafrir, 2005, Sorting points into neighborhoods (SPIN): data analysis and visualization by ordering distance matrices, Bioinformatics, 21, 2301, 10.1093/bioinformatics/bti329

Vallejos, 2015, BASiCS: Bayesian analysis of single-cell sequencing data, PLoS Comput. Biol., 11, e1004333, 10.1371/journal.pcbi.1004333

Vu, 2016, Beta-poisson model for single-cell RNA-seq data analyses, Bioinformatics, 32, 2128, 10.1093/bioinformatics/btw202

Wang, 2016, Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning, bioRxiv, 052225, 10.1101/052225

Wang, 2012, Multiple graph regularized protein domain ranking, BMC Bioinformatics, 13, 307, 10.1186/1471-2105-13-307

Wang, 2010, MapSplice: accurate mapping of RNA-seq reads for splice junction discovery, Nucleic Acids Res., 38, e178, 10.1093/nar/gkq622

Welch, 2016, SLICER: inferring branched, nonlinear cellular trajectories from single cell RNA-seq data, Genome Biol., 17, 106, 10.1186/s13059-016-0975-3

Wu, 2016, GMAP and GSNAP for genomic sequence alignment: enhancements to speed, accuracy, and functionality, Stat. Genomics Methods Protoc., 1418, 283, 10.1007/978-1-4939-3578-9_15

Xu, 2015, Identification of cell types from single-cell transcriptomes using a novel clustering method, Bioinformatics, 31, 1974, 10.1093/bioinformatics/btv088

Yan, 2013, Single-cell RNA-seq profiling of human preimplantation embryos and embryonic stem cells, Nat. Struct. Mol. Biol., 20, 1131, 10.1038/nsmb.2660

Yang, 2013, HTQC: a fast quality control toolkit for Illumina sequencing data, BMC Bioinformatics, 14, 33, 10.1186/1471-2105-14-33

Zeisel, 2015, Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq, Science, 347, 1138, 10.1126/science.aaa1934

Zhang, 2011, BIGpre: a quality assessment package for next-generation sequencing data, Genomics, Proteomics Bioinformatics, 9, 238, 10.1016/S1672-0229(11)60027-2

Zhu, 2016, Constructing 3D interaction maps from 1D epigenomes, Nat. Commun., 7, 10812, 10.1038/ncomms10812

Zurauskiene, 2016, pcaReduce: hierarchical clustering of single cell transcriptional profiles, BMC Bioinformatics, 17, 140, 10.1186/s12859-016-0984-y