Statistical genomics in rare cancer
References
Keat, 2013, International rare cancers initiative, Lancet Oncol., 14, 109, 10.1016/S1470-2045(12)70570-3
DeSantis, 2017, The burden of rare cancers in the United States, CA Cancer J. Clin., 67, 261, 10.3322/caac.21400
Gatta, 2011, Rare cancers are not so rare: the rare cancer burden in Europe, Eur. J. Cancer, 47, 2493, 10.1016/j.ejca.2011.08.008
Edgar, 2002, Gene Expression Omnibus: NCBI gene expression and hybridization array data repository, Nucleic Acids Res., 30, 207, 10.1093/nar/30.1.207
Barrett, 2013, NCBI GEO: archive for functional genomics data sets--update, Nucleic Acids Res., 41, D991, 10.1093/nar/gks1193
Cancer Genome Atlas Research Network, 2013, The cancer genome atlas pan-cancer analysis project, Nat. Genet., 45, 1113, 10.1038/ng.2764
Zheng, 2016, Comprehensive pan-genomic characterization of adrenocortical carcinoma, Cancer Cell, 29, 723, 10.1016/j.ccell.2016.04.002
Farshidfar, 2017, Integrative genomic analysis of cholangiocarcinoma identifies distinct IDH-Mutant molecular profiles, Cell Rep., 18, 2780, 10.1016/j.celrep.2017.02.033
Hoadley, 2018, Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer, Cell, 173, 10.1016/j.cell.2018.03.022
Hmeljak, 2018, Integrative molecular characterization of malignant pleural mesothelioma, Cancer Discov., 8, 1548, 10.1158/2159-8290.CD-18-0804
Fishbein, 2017, Comprehensive molecular characterization of pheochromocytoma and paraganglioma, Cancer Cell, 31, 181, 10.1016/j.ccell.2017.01.001
Cancer Genome Atlas Research Network, 2017, Comprehensive and integrated genomic characterization of adult soft tissue sarcomas, Cell, 171, e928
Shen, 2018, Integrated molecular characterization of testicular germ cell tumors, Cell Rep., 23, 3392, 10.1016/j.celrep.2018.05.039
Cherniack, 2017, Integrated molecular characterization of uterine carcinosarcoma, Cancer Cell, 31, 411, 10.1016/j.ccell.2017.02.010
Robertson, 2017, Integrative analysis identifies four molecular and clinical subsets in uveal melanoma, Cancer Cell, 32, e215
Liu, 2017, The genomic landscape of pediatric and young adult T-lineage acute lymphoblastic leukemia, Nat. Genet., 49, 1211, 10.1038/ng.3909
Bolouri, 2018, The molecular landscape of pediatric acute myeloid leukemia reveals recurrent structural alterations and age-specific mutational interactions, Nat. Med., 24, 103, 10.1038/nm.4439
Pugh, 2013, The genetic landscape of high-risk neuroblastoma, Nat. Genet., 45, 279, 10.1038/ng.2529
Armstrong, 2018, A unique subset of low-risk Wilms tumors is characterized by loss of function of TRIM28 (KAP1), a gene critical in early renal development: a children’s oncology group study, PLoS One, 13, 10.1371/journal.pone.0208936
Blay, 2016, The value of research collaborations and consortia in rare cancers, Lancet Oncol., 17, e62, 10.1016/S1470-2045(15)00388-5
Ovarian Cancer Association Consortium, 2015, No clinical utility of KRAS variant rs61764370 for ovarian or breast cancer, Gynecol. Oncol.
Phelan, 2017, Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer, Nat. Genet., 49, 680, 10.1038/ng.3826
Easton, 2007, Genome-wide association study identifies novel breast cancer susceptibility loci, Nature, 447, 1087, 10.1038/nature05887
Zhang, 2011, International cancer genome consortium data portal--a one-stop shop for cancer genomics data, Database, 2011, 10.1093/database/bar026
Zhang, 2019, The international cancer genome consortium data portal, Nat. Biotechnol., 37, 367, 10.1038/s41587-019-0055-9
Varley, 1997, Germ-line mutations of TP53 in Li-Fraumeni families: an extended study of 39 families, Cancer Res., 57, 3245
Eng, 1997, Third international workshop on collaborative interdisciplinary studies of p53 and other predisposing genes in Li-Fraumeni syndrome, Cancer Epidemiol. Biomarkers Prev., 6, 379
Johnson, 2007, Adjusting batch effects in microarray expression data using empirical Bayes methods, Biostatistics, 8, 118, 10.1093/biostatistics/kxj037
Abbas-Aghababazadeh, 2018, Comparison of normalization approaches for gene expression studies completed with high-throughput sequencing, PLoS One, 13, 10.1371/journal.pone.0206312
Price, 2010, New approaches to population stratification in genome-wide association studies, Nat. Rev. Genet., 11, 459, 10.1038/nrg2813
Deb, 2014, Mutational profiling of familial male breast cancers reveals similarities with luminal A female breast cancer with rare TP53 mutations, Br. J. Cancer, 111, 2351, 10.1038/bjc.2014.511
Weiss, 2005, Epidemiology of male breast cancer, Cancer Epidemiol. Biomarkers Prev., 14, 20, 10.1158/1055-9965.20.14.1
Korde, 2010, Multidisciplinary meeting on male breast cancer: summary and research recommendations, J. Clin. Oncol., 28, 2114, 10.1200/JCO.2009.25.5729
Harlan, 2010, Breast cancer in men in the United States: a population-based study of diagnosis, treatment, and survival, Cancer, 116, 3558, 10.1002/cncr.25153
Giordano, 2005, A review of the diagnosis and management of male breast cancer, Oncologist, 10, 471, 10.1634/theoncologist.10-7-471
Chang, 2013, Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline, BMC Bioinformatics, 14, 368, 10.1186/1471-2105-14-368
Wang, 2013, Comparing methods for performing trans-ethnic meta-analysis of genome-wide association studies, Hum. Mol. Genet., 22, 2303, 10.1093/hmg/ddt064
Ramasamy, 2008, Key issues in conducting a meta-analysis of gene expression microarray datasets, PLoS Med., 5, e184, 10.1371/journal.pmed.0050184
Thompson, 2011, The meta-analysis of genome-wide association studies, Brief. Bioinform., 12, 259, 10.1093/bib/bbr020
Mo, 2018, Prognostic power of a tumor differentiation gene signature for bladder urothelial carcinomas, J. Natl. Cancer Inst., 110, 448, 10.1093/jnci/djx243
Richardson, 2016, Statistical methods in integrative genomics, Annu. Rev. Stat. Appl., 3, 181, 10.1146/annurev-statistics-041715-033506
Tseng, 2012, Comprehensive literature review and statistical considerations for microarray meta-analysis, Nucleic Acids Res., 40, 3785, 10.1093/nar/gkr1265
Borenstein, 2009, Introduction to Meta-Analysis, Chichester, UK
Rhodes, 2002, Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer, Cancer Res., 62, 4427
Fisher, 1932
Stouffer, 1949, The American soldier, Vol 1
van Zwet, 1967, On the combination of independent test statistics, Ann. Math. Stat., 38, 659, 10.1214/aoms/1177698861
Won, 2009, Choosing an optimal method to combine P-values, Stat. Med., 28, 1537, 10.1002/sim.3569
Tippett, 1931
Li, 2011, An adaptively weighted statistic for detecting differential gene expression when combining multiple transcriptomic studies, Ann. Appl. Stat., 5, 994, 10.1214/10-AOAS393
Barton, 2013, Correction of unexpected distributions of P values from analysis of whole genome arrays by rectifying violation of statistical assumptions, BMC Genomics, 14, 161, 10.1186/1471-2164-14-161
Fodor, 2007, Towards the uniform distribution of null P values on Affymetrix microarrays, Genome Biol., 8, R69, 10.1186/gb-2007-8-5-r69
Borenstein, 2010, A basic introduction to fixed-effect and random-effects models for meta-analysis, Res. Synth. Methods, 1, 97, 10.1002/jrsm.12
Brockwell, 2001, A comparison of statistical methods for meta-analysis, Stat. Med., 20, 825, 10.1002/sim.650
Goldstein, 2011
Viechtbauer, 2005, Bias and efficiency of meta-analytic variance estimators in the random-effects model, J. Educ. Behav. Stat., 30, 261, 10.3102/10769986030003261
Cochran, 1954, The combination of estimates from different experiments, Biometrics, 10, 101, 10.2307/3001666
Paul, 1992, Small sample performance of tests of homogeneity of odds ratios in K 2 x 2 tables, Stat. Med., 11, 159, 10.1002/sim.4780110203
Hardy, 1998, Detecting and describing heterogeneity in meta-analysis, Stat. Med., 17, 841, 10.1002/(SICI)1097-0258(19980430)17:8<841::AID-SIM781>3.0.CO;2-D
Higgins, 2003, Measuring inconsistency in meta-analyses, BMJ, 327, 557, 10.1136/bmj.327.7414.557
Higgins, 2002, Quantifying heterogeneity in a meta-analysis, Stat. Med., 21, 1539, 10.1002/sim.1186
Lin, 2009, Integration of ranked lists via cross entropy Monte Carlo with applications to mRNA and microRNA Studies, Biometrics, 65, 9, 10.1111/j.1541-0420.2008.01044.x
Deng, 2014, Bayesian aggregation of order-based rank data, J. Am. Stat. Assoc., 109, 1023, 10.1080/01621459.2013.878660
Hong, 2006, RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis, Bioinformatics, 22, 2825, 10.1093/bioinformatics/btl476
Dreyfuss, 2009, Meta-analysis of glioblastoma multiforme versus anaplastic astrocytoma identifies robust gene markers, Mol. Cancer, 8, 71, 10.1186/1476-4598-8-71
Zintzaras, 2008, Meta-analysis for ranked discovery datasets: theoretical framework and empirical demonstration for microarrays, Comput. Biol. Chem., 32, 38, 10.1016/j.compbiolchem.2007.09.003
DeConde, 2006
Hong, 2008, A comparison of meta-analysis methods for detecting differentially expressed genes in microarray experiments, Bioinformatics, 24, 374, 10.1093/bioinformatics/btm620
Li, 2019, A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications, Brief. Bioinformatics, 20, 178, 10.1093/bib/bbx101
Balding, 2007
Liang, 2000, Statistical designs for familial aggregation, Stat. Methods Med. Res., 9, 543, 10.1177/096228020000900603
Jarvik, 1998, Complex segregation analyses: uses and limitations, Am. J. Hum. Genet., 63, 942, 10.1086/302075
Genetic Approaches to Familial Aggregation. II. Segregation Analysis. In Fundamentals of Genetic Epidemiology. pp 233-283.
Elston, 1998, Methods of linkage analysis--and the assumptions underlying them [see comment], Am. J. Hum. Genet., 63, 931, 10.1086/302073
Dawn Teare, 2005, Genetic linkage studies, Lancet, 366, 1036, 10.1016/S0140-6736(05)67382-5
Kruglyak, 1996, Parametric and nonparametric linkage analysis: a unified multipoint approach, Am. J. Hum. Genet., 58, 1347
Malkin, 2011, Li-Fraumeni syndrome, Genes Cancer, 2, 475, 10.1177/1947601911413466
Varley, 1997, Li-Fraumeni syndrome--a molecular and clinical review, Br. J. Cancer, 76, 1, 10.1038/bjc.1997.328
Balding, 2006, A tutorial on statistical methods for population association studies, Nat. Rev. Genet., 7, 781, 10.1038/nrg1916
Chung, 2010, Genome-wide association studies in cancer--current and future directions, Carcinogenesis, 31, 111, 10.1093/carcin/bgp273
Capasso, 2009, Common variations in BARD1 influence susceptibility to high-risk neuroblastoma, Nat. Genet., 41, 718, 10.1038/ng.374
Maris, 2008, Chromosome 6p22 locus associated with clinically aggressive neuroblastoma, N. Engl. J. Med., 358, 2585, 10.1056/NEJMoa0708698
Kanehisa, 2017, KEGG: new perspectives on genomes, pathways, diseases and drugs, Nucleic Acids Res., 45, D353, 10.1093/nar/gkw1092
Kanehisa, 2000, KEGG: kyoto encyclopedia of genes and genomes, Nucleic Acids Res., 28, 27, 10.1093/nar/28.1.27
Ashburner, 2000, Gene ontology: tool for the unification of biology. The Gene Ontology Consortium, Nat. Genet., 25, 25, 10.1038/75556
Lachmann, 2010, ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments, Bioinformatics, 26, 2438, 10.1093/bioinformatics/btq466
Mezzapelle, 2013, Mutation analysis of the EGFR gene and downstream signalling pathway in histologic samples of malignant pleural mesothelioma, Br. J. Cancer, 108, 1743, 10.1038/bjc.2013.130
Goeman, 2007, Analyzing gene expression data in terms of gene sets: methodological issues, Bioinformatics, 23, 980, 10.1093/bioinformatics/btm051
Fridley, 2011, Gene set analysis of SNP data: benefits, challenges, and future directions, Eur. J. Hum. Genet., 19, 837, 10.1038/ejhg.2011.57
Subramanian, 2005, Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles, Proc. Natl. Acad. Sci. U. S. A., 102, 15545, 10.1073/pnas.0506580102
Chen, 2013, Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool, BMC Bioinformatics, 14, 128, 10.1186/1471-2105-14-128
Kuleshov, 2016, Enrichr: a comprehensive gene set enrichment analysis web server 2016 update, Nucleic Acids Res., 44, W90, 10.1093/nar/gkw377
Huang, 2009, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources, Nat. Protoc., 4, 44, 10.1038/nprot.2008.211
Dennis, 2003, DAVID: database for annotation, visualization, and integrated discovery, Genome Biol., 4, 10.1186/gb-2003-4-9-r60
Ferreira, 2008, Array CGH and gene-expression profiling reveals distinct genomic instability patterns associated with DNA repair and cell-cycle checkpoint pathways in Ewing’s sarcoma, Oncogene, 27, 2084, 10.1038/sj.onc.1210845
Kikuta, 2009, Nucleophosmin as a candidate prognostic biomarker of Ewing’s sarcoma revealed by proteomics, Clin. Cancer Res., 15, 2885, 10.1158/1078-0432.CCR-08-1913
Goeman, 2004, A global test for groups of genes: testing association with a clinical outcome, Bioinformatics, 20, 93, 10.1093/bioinformatics/btg382
Biernacka, 2012, Use of the gamma method for self-contained gene-set analysis of SNP data, Eur. J. Hum. Genet., 20, 565, 10.1038/ejhg.2011.236
Fridley, 2013, Soft truncation thresholding for gene set analysis of RNA-seq data: application to a vaccine study, Sci. Rep., 3, 2898, 10.1038/srep02898
de Rooij, 2017, Pediatric non-Down syndrome acute megakaryoblastic leukemia is characterized by distinct genomic subsets with varying outcomes, Nat. Genet., 49, 451, 10.1038/ng.3772
Saelens, 2018, A comprehensive evaluation of module detection methods for gene expression data, Nat. Commun., 9, 1090, 10.1038/s41467-018-03424-4
Werhli, 2006, Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical gaussian models and bayesian networks, Bioinformatics, 22, 2523, 10.1093/bioinformatics/btl391
Grzegorczyk, 2007, Extracting protein regulatory networks with graphical models, Proteomics, 1, 51, 10.1002/pmic.200700466
Butte, 2000, Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks, Proc. Natl. Acad. Sci. U. S. A., 97, 12182, 10.1073/pnas.220392197
Langfelder, 2008, WGCNA: an R package for weighted correlation network analysis, BMC Bioinformatics, 9, 559, 10.1186/1471-2105-9-559
Zhang, 2005, A general framework for weighted gene co-expression network analysis, Stat. Appl. Genet. Mol. Biol., 4, 10.2202/1544-6115.1128
Yip, 2007, Gene network interconnectedness and the generalized topological overlap measure, BMC Bioinformatics, 8, 22, 10.1186/1471-2105-8-22
Wang, 2019, Weighted gene coexpression network analysis for identifying hub genes in association with prognosis in Wilms tumor, Mol. Med. Rep., 19, 2041
Yuan, 2018, Co-expression network analysis of biomarkers for adrenocortical carcinoma, Front. Genet., 9, 328, 10.3389/fgene.2018.00328
Zhang, 2019, Co-expression network analysis identified gene signatures in Osteosarcoma as a predictive tool for lung metastasis and survival, J. Cancer, 10, 3706, 10.7150/jca.32092
Schafer, 2005, An empirical Bayes approach to inferring large-scale gene association networks, Bioinformatics, 21, 754, 10.1093/bioinformatics/bti062
Zhao, 2019, Cancer genetic network inference using gaussian graphical models, Bioinform. Biol. Insights, 13, 10.1177/1177932219839402
Friedman, 2000, Using bayesian networks to analyze expression data, J. Comput. Biol., 7, 601, 10.1089/106652700750050961
Ni, 2018, Bayesian graphical models for computational network biology, BMC Bioinformatics, 19, 63, 10.1186/s12859-018-2063-z
Bulashevska, 2010, Bayesian statistical modelling of human protein interaction network incorporating protein disorder information, BMC Bioinformatics, 11, 46, 10.1186/1471-2105-11-46
Hill, 2012, Bayesian inference of signaling network topology in a cancer cell line, Bioinformatics, 28, 2804, 10.1093/bioinformatics/bts514
Kramer, 2009, Regularized estimation of large-scale gene association networks using graphical Gaussian models, BMC Bioinformatics, 10, 384, 10.1186/1471-2105-10-384
Yin, 2011, A sparse conditional gaussian graphical model for analysis of genetical genomics data, Ann. Appl. Stat., 5, 2630, 10.1214/11-AOAS494
Chun, 2015, Gene regulation network inference with joint sparse Gaussian graphical models, J. Comput. Graph. Stat., 24, 954, 10.1080/10618600.2014.956876
Blum, 2016, Sparse factor model for co-expression networks with an application using prior biological knowledge, Stat. Appl. Genet. Mol. Biol., 15, 253, 10.1515/sagmb-2015-0002
Serra, 2018, Robust and sparse correlation matrix estimation for the analysis of high-dimensional genomics data, Bioinformatics, 34, 625, 10.1093/bioinformatics/btx642
Schafer, 2005, A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics, Stat. Appl. Genet. Mol. Biol., 4, 10.2202/1544-6115.1175
Kristensen, 2014, Principles and methods of integrative genomic analyses in cancer, Nat. Rev. Cancer, 14, 299, 10.1038/nrc3721
Wu, 2019, A selective review of multi-level omics data integration using variable selection, High Throughput, 8
Jiang, 2016, Integrated analysis of multidimensional omics data on cutaneous melanoma prognosis, Genomics, 107, 223, 10.1016/j.ygeno.2016.04.005
Zhao, 2015, Combining multidimensional genomic measurements for predicting cancer prognosis: observations from TCGA, Brief. Bioinform., 16, 291, 10.1093/bib/bbu003
Kandoth, 2013, Mutational landscape and significance across 12 major cancer types, Nature, 502, 333, 10.1038/nature12634
Zack, 2013, Pan-cancer patterns of somatic copy number alteration, Nat. Genet., 45, 1134, 10.1038/ng.2760
Chen, 2018, A pan-cancer analysis of enhancer expression in nearly 9000 patient samples, Cell, 173, 10.1016/j.cell.2018.03.027
Sanchez-Vega, 2018, Oncogenic signaling pathways in The Cancer Genome Atlas, Cell, 173, e310
Rosario, 2018, Pan-cancer analysis of transcriptional metabolic dysregulation using the cancer genome atlas, Nat. Commun., 9, 5330, 10.1038/s41467-018-07232-8
Cancer Genome Atlas Network, 2012, Comprehensive molecular portraits of human breast tumours, Nature, 490, 61, 10.1038/nature11412
Radovich, 2018, The integrated genomic landscape of thymic epithelial tumors, Cancer Cell, 33, e210
Shen, 2009, Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis, Bioinformatics, 25, 2906, 10.1093/bioinformatics/btp543
Shen, 2013, Sparse integrative clustering of multiple omics data sets, Ann. Appl. Stat., 7, 269, 10.1214/12-AOAS578
Mo, 2013, Pattern discovery and cancer gene identification in integrated cancer genomic data, Proc. Natl. Acad. Sci. U. S. A., 110, 4245, 10.1073/pnas.1208949110
Mo, 2018, A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data, Biostatistics, 19, 71, 10.1093/biostatistics/kxx017
Brunet, 2004, Metagenes and molecular pattern discovery using matrix factorization, Proc. Natl. Acad. Sci. U. S. A., 101, 4164, 10.1073/pnas.0308531101
Gao, 2005, Improving molecular cancer class discovery through sparse non-negative matrix factorization, Bioinformatics, 21, 3970, 10.1093/bioinformatics/bti653
Kim, 2007, Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis, Bioinformatics, 23, 1495, 10.1093/bioinformatics/btm134
Monti, 2003, Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data, Mach. Learn., 52, 91, 10.1023/A:1023949509487
Zhang, 2012, Discovery of multi-dimensional modules by integrative analysis of cancer genomic data, Nucleic Acids Res., 40, 9379, 10.1093/nar/gks725
Yang, 2016, A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data, Bioinformatics, 32, 1, 10.1093/bioinformatics/btv544
Chalise, 2017, Integrative clustering of multi-level 'omic data based on non-negative matrix factorization algorithm, PLoS One, 12, 10.1371/journal.pone.0176278
