Systematic identification of long noncoding RNAs expressed during zebrafish embryogenesis

Genome Research - Volume 22, Issue 3 - Pages 577-591 - 2012
Andrea Pauli (1), Eivind Valen (2), Michael Lin (3,4), Manuel Garber (4), Nadine L. Vastenhouw (5), Joshua Z. Levin (4), Fan Lin (4), Albin Sandelin (2), John L. Rinn (6,4), Aviv Regev (7,3,4), Alexander F. Schier (5,4)
(1) Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
(2) The Bioinformatics Centre, Department of Biology and the Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen DK-2200, Denmark
(3) Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
(4) The Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
(5) Department of Molecular and Cellular Biology (MCB), Harvard University, Cambridge, Massachusetts 02138, USA
(6) Department of Stem Cell and Regenerative Biology (SCRB), Harvard University, Cambridge, Massachusetts 02138, USA
(7) Howard Hughes Medical Institute (HHMI), Chevy Chase, Maryland 20815, USA

Abstract

Long noncoding RNAs (lncRNAs) comprise a diverse class of transcripts that structurally resemble mRNAs but do not encode proteins. Recent genome-wide studies in human and mouse have annotated lncRNAs expressed in cell lines and adult tissues, but a systematic analysis of lncRNAs expressed during vertebrate embryogenesis has been elusive. To identify lncRNAs with potential functions in vertebrate embryogenesis, we performed a time series of RNA-seq experiments at eight stages during early zebrafish development. We reconstructed 56,535 high-confidence transcripts in 28,912 loci, recovering the vast majority of expressed RefSeq transcripts while identifying thousands of novel isoforms and expressed loci. We defined a stringent set of 1133 noncoding multi-exonic transcripts expressed during embryogenesis. These include long intergenic ncRNAs (lincRNAs), intronic overlapping lncRNAs, exonic antisense overlapping lncRNAs, and precursors for small RNAs (sRNAs). Zebrafish lncRNAs share many of the characteristics of their mammalian counterparts: relatively short length, low exon number, low expression, and conservation levels comparable to those of introns. Subsets of lncRNAs carry chromatin signatures characteristic of genes with developmental functions. The temporal expression profile of lncRNAs revealed two novel properties: lncRNAs are expressed in narrower time windows than are protein-coding genes and are specifically enriched in early-stage embryos. In addition, several lncRNAs show tissue-specific expression and distinct subcellular localization patterns. Integrative computational analyses associated individual lncRNAs with specific pathways and functions, ranging from cell cycle regulation to morphogenesis. Our study provides the first systematic identification of lncRNAs in a vertebrate embryo and forms the foundation for future genetic, genomic, and evolutionary studies.
