CGPS: A machine learning-based approach integrating multiple gene set analysis tools for better prioritization of biologically relevant pathways

Journal of Genetics and Genomics - Volume 45 - Pages 489-504 - 2018
Chen Ai¹, Lei Kong¹
¹ Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
