Bioinformatics and genomic databases

Handbook of Clinical Neurology - Volume 147 - Pages 75-92 - 2018
Jason Chen¹, Giovanni Coppola¹
¹Interdepartmental Program in Bioinformatics and Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, United States
