Lisa: inferring transcriptional regulators through integrative modeling of public chromatin accessibility and ChIP-seq data

Genome Biology - Volume 21 - Pages 1-14 - 2020
Qian Qin1,2, Jingyu Fan1, Rongbin Zheng1, Changxin Wan1, Shenglin Mei1, Qiu Wu1, Hanfei Sun1, Myles Brown3,4, Jing Zhang5, Clifford A. Meyer6,4, X. Shirley Liu6,4
1Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, China
2Center of Molecular Medicine, Children’s Hospital of Fudan University, Shanghai, China
3Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
4Department of Data Sciences, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, USA
5Stem Cell Translational Research Center, Tongji Hospital, School of Life Science and Technology, Tongji University, Shanghai, China
6Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, USA

Abstract

We developed Lisa (http://lisa.cistrome.org/) to predict the transcriptional regulators (TRs) of differentially expressed or co-expressed gene sets. Based on the input gene sets, Lisa first uses histone mark ChIP-seq and chromatin accessibility profiles to construct a chromatin model related to the regulation of these genes. Using TR ChIP-seq peaks or imputed TR binding sites, Lisa probes the chromatin models using in silico deletion to find the most relevant TRs. Applied to gene sets derived from targeted transcription factor (TF) perturbation experiments, Lisa boosted the performance of imputed TR cistromes and outperformed alternative methods in identifying the perturbed TRs.
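The idea of in silico deletion can be illustrated with a toy sketch. This is not Lisa's implementation: the exponential distance weighting, the 10 kb decay constant, the mean-difference score, and all function names and data below are assumptions chosen only to make the concept concrete. Each gene has chromatin signal in a few bins around its transcription start site (TSS); "deleting" a TR means zeroing the signal in bins that the TR binds and measuring how much the gene's summed, distance-weighted signal drops.

```python
import math

def regulatory_potential(signal, distances, decay=10_000.0):
    """Sum chromatin signal, down-weighting bins far from the TSS (assumed weighting)."""
    return sum(s * math.exp(-abs(d) / decay) for s, d in zip(signal, distances))

def deletion_effect(signal, distances, bound):
    """Drop in regulatory potential when signal in TR-bound bins is set to zero."""
    full = regulatory_potential(signal, distances)
    deleted = [0.0 if b else s for s, b in zip(signal, bound)]
    return full - regulatory_potential(deleted, distances)

def rank_regulators(genes, tr_binding, query):
    """Rank TRs by mean deletion effect on query genes minus background genes."""
    scores = {}
    for tr, bound_by_gene in tr_binding.items():
        q_effects, bg_effects = [], []
        for name, (signal, distances) in genes.items():
            effect = deletion_effect(signal, distances, bound_by_gene[name])
            (q_effects if name in query else bg_effects).append(effect)
        scores[tr] = (sum(q_effects) / len(q_effects)
                      - sum(bg_effects) / len(bg_effects))
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: gene -> (signal per bin, bin distance to TSS in bp).
genes = {
    "geneA": ([5.0, 2.0, 1.0], [0, 5_000, 50_000]),
    "geneB": ([4.0, 3.0, 0.5], [1_000, 8_000, 40_000]),
    "geneC": ([1.0, 1.0, 6.0], [2_000, 20_000, 60_000]),
}
# Hypothetical binding: TR -> gene -> whether the TR binds each bin.
tr_binding = {
    "TR1": {"geneA": [True, False, False],
            "geneB": [True, True, False],
            "geneC": [False, False, False]},
    "TR2": {"geneA": [False, False, True],
            "geneB": [False, False, False],
            "geneC": [True, False, True]},
}

ranking = rank_regulators(genes, tr_binding, query={"geneA", "geneB"})
print(ranking)
```

In this toy setup TR1 binds strong, TSS-proximal signal in the query genes and nothing in the background gene, so it ranks first. A real analysis would replace the fixed weighting with a chromatin model fitted to the query gene set and would use a statistical test rather than a raw mean difference.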
