ROP: dumpster diving in RNA-sequencing to find the source of 1 trillion reads across diverse adult human tissues

Genome Biology - Volume 19 - Pages 1-12 - 2018
Serghei Mangul1,2, Harry Taegyun Yang1, Nicolas Strauli3, Franziska Gruhl4,5, Hagit T. Porath6, Kevin Hsieh1, Linus Chen7, Timothy Daley8, Stephanie Christenson9, Agata Wesolowska-Andersen10, Roberto Spreafico2, Cydney Rios10, Celeste Eng11, Andrew D. Smith8, Ryan D. Hernandez12,13,14, Roel A. Ophoff15,16,17, Jose Rodriguez Santana18, Erez Y. Levanon19, Prescott G. Woodruff9, Esteban Burchard20, Max A. Seibold21,22, Sagiv Shifman23, Eleazar Eskin1,16, Noah Zaitlen9
1Department of Computer Science, University of California, Los Angeles, USA
2Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, USA
3Biomedical Sciences Graduate Program, University of California, San Francisco, USA
4Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
5SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
6The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
7Department of Bioengineering, University of California, Los Angeles, USA
8Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, USA
9Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, and Cardiovascular Research Institute, University of California, San Francisco, USA
10Center for Genes, Environment, and Health, National Jewish Health, Denver, USA
11Department of Medicine, University of California, San Francisco, USA
12Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA
13Institute for Quantitative Biosciences, University of California, San Francisco, USA
14Institute for Human Genetics, University of California San Francisco, San Francisco, USA
15Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, USA
16Department of Human Genetics, University of California, Los Angeles, USA
17Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
18Centro de Neumología Pediátrica, San Juan, Puerto Rico
19The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
20Schools of Pharmacy and Medicine, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA
21Department of Pediatrics, National Jewish Health, Denver, USA
22University of Colorado School of Medicine, Denver, USA
23Department of Genetics, The Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel

Abstract

High-throughput RNA-sequencing (RNA-seq) technologies provide an unprecedented opportunity to explore the individual transcriptome. Unmapped reads are a large and often overlooked output of standard RNA-seq analyses. Here, we present Read Origin Protocol (ROP), a tool for discovering the source of all reads originating from complex RNA molecules. We apply ROP to samples across 2630 individuals from 54 diverse human tissues. Our approach can account for 99.9% of 1 trillion reads of various read lengths. Additionally, we use ROP to investigate the functional mechanisms underlying connections between the immune system, microbiome, and disease. ROP is freely available at https://github.com/smangul1/rop/wiki.

References