Target discovery from protein databases: challenges for curation

Drug Discovery Today: Technologies - Volume 14 - Pages 11-16 - 2015
Christine Chichester¹, Pascale Gaudet¹
¹Swiss Institute of Bioinformatics, CALIPHO Group, CMU - Rue Michel-Servet 1, 1211 Geneva 4, Switzerland

References

Taylor, 2007, The minimum information about a proteomics experiment (MIAPE), Nat Biotechnol, 25, 887, 10.1038/nbt1329
Hermjakob, 2004, The HUPO PSI's molecular interaction format – a community standard for the representation of protein interaction data, Nat Biotechnol, 22, 177, 10.1038/nbt926
Vizcaino, 2013, The Proteomics Identifications (PRIDE) database and associated tools: status in 2013, Nucleic Acids Res, 41, D1063, 10.1093/nar/gks1262
Farrah, 2014, State of the human proteome in 2013 as viewed through PeptideAtlas: comparing the kidney, urine, and plasma proteomes for the biology- and disease-driven Human Proteome Project, J Proteome Res, 13, 60, 10.1021/pr4010037
Chichester, 2014, Converting neXtProt into Linked Data and nanopublications, Semantic Web J, 5
Jupp, 2014, The EBI RDF platform: linked open data for the life sciences, Bioinformatics, 30, 1338, 10.1093/bioinformatics/btt765
Petryszak, 2014, Expression Atlas update-a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments, Nucleic Acids Res, 42, D926, 10.1093/nar/gkt1270
Chichester, 2014, Querying neXtProt nanopublications and their value for insights on sequence variants and tissue expression, J Web Semantics, 10.1016/j.websem.2014.05.001
Oberle, 2009, What is an ontology?
Balakrishnan, 2013, A guide to best practices for Gene Ontology (GO) manual annotation, Database (Oxford), 2013, 10.1093/database/bat054
Köhler, 2014, The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data, Nucleic Acids Res, 42, D966, 10.1093/nar/gkt1026
Mungall, 2012, Uberon, an integrative multi-species anatomy ontology, Genome Biol, 13, R5, 10.1186/gb-2012-13-1-r5
Howe, 2008, Big data: the future of biocuration, Nature, 455, 47, 10.1038/455047a
Stelzer, 2011, In-silico human genomics with GeneCards, Hum Genomics, 5, 709, 10.1186/1479-7364-5-6-709
Burge, 2012, Biocurators and biocuration: surveying the 21st century challenges, Database (Oxford), 2012, 10.1093/database/bar059
Schnoes, 2009, Annotation error in public databases: misannotation of molecular function in enzyme superfamilies, PLoS Comput Biol, 5, e1000605, 10.1371/journal.pcbi.1000605
Vasilevsky, 2013, On the reproducibility of science: unique identification of research resources in the biomedical literature, PeerJ, 1, e148, 10.7717/peerj.148
Poux, 2014, Expert curation in UniProtKB: a case study on dealing with conflicting and erroneous data, Database (Oxford), 2014, 10.1093/database/bau016
Wang, 2008, Alternative isoform regulation in human tissue transcriptomes, Nature, 456, 470, 10.1038/nature07509
Lane, 2012, neXtProt: a knowledge platform for human proteins, Nucleic Acids Res, 40, D76, 10.1093/nar/gkr1179
Gaudet, 2013, neXtProt: organizing protein knowledge in the context of human proteome projects, J Proteome Res, 12, 293, 10.1021/pr300830v
UniProt Consortium, 2014, Activities at the Universal Protein Resource (UniProt), Nucleic Acids Res, 42, D191, 10.1093/nar/gku469
Hunter, 2012, InterPro in 2011: new developments in the family and domain prediction database, Nucleic Acids Res, 40, D306, 10.1093/nar/gkr948
Gaudet, 2011, Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium, Brief Bioinform, 12, 449, 10.1093/bib/bbr042
OhÉigeartaigh, 2011, Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments, BMC Genomics, 12, 377, 10.1186/1471-2164-12-377
Friedberg, 2006, Automated protein function prediction – the genomic challenge, Brief Bioinform, 7, 225, 10.1093/bib/bbl004
Sherry, 2001, dbSNP: the NCBI database of genetic variation, Nucleic Acids Res, 29, 308, 10.1093/nar/29.1.308
Forbes, 2011, COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer, Nucleic Acids Res, 39, D945, 10.1093/nar/gkq929
Orchard, 2012, Protein interaction data curation: the International Molecular Exchange (IMEx) consortium, Nat Methods, 9, 345, 10.1038/nmeth.1931
Kerrien, 2007, Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions, BMC Biol, 5, 44, 10.1186/1741-7007-5-44
Orchard, 2007, The minimum information required for reporting a molecular interaction experiment (MIMIx), Nat Biotechnol, 25, 894, 10.1038/nbt1324
Kerrien, 2012, The IntAct molecular interaction database in 2012, Nucleic Acids Res, 40, D841, 10.1093/nar/gkr1088
Berman, 2000, The Protein Data Bank, Nucleic Acids Res, 28, 235, 10.1093/nar/28.1.235
Murzin, 1995, SCOP: a structural classification of proteins database for the investigation of sequences and structures, J Mol Biol, 247, 536, 10.1016/S0022-2836(05)80134-2
Greene, 2007, The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution, Nucleic Acids Res, 35, D291, 10.1093/nar/gkl959
Villoutreix, 2009, Structure-based virtual ligand screening: recent success stories, Comb Chem High Throughput Screen, 12, 1000, 10.2174/138620709789824682
Albou, 2011, M-ORBIS: mapping of molecular binding sites and surfaces, Nucleic Acids Res, 39, 30, 10.1093/nar/gkq736
Liebel, 2003, A microscope-based screening platform for large-scale functional protein analysis in intact cells, FEBS Lett, 554, 394, 10.1016/S0014-5793(03)01197-9
Simpson, 2000, Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing, EMBO Rep, 1, 287, 10.1093/embo-reports/kvd058
Sigal, 2007, Generation of a fluorescently labeled endogenous protein library in living human cells, Nat Protoc, 2, 1515, 10.1038/nprot.2007.197
Fagerberg, 2014, Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics, Mol Cell Proteomics, 13, 397, 10.1074/mcp.M113.035600
Rosikiewicz, 2013, Uncovering hidden duplicated content in public transcriptomics data, Database (Oxford), 2013, bat010, 10.1093/database/bat010
Dimmer, 2012, The UniProt-GO Annotation database in 2011, Nucleic Acids Res, 40, D565, 10.1093/nar/gkr1048