A Resource of Quantitative Functional Annotation for <i>Homo sapiens</i> Genes

G3: Genes, Genomes, Genetics - Volume 2, Issue 2 - Pages 223-233 - 2012
Murat Taşan<sup>1,2</sup>, Harold J. Drabkin<sup>3</sup>, John Beaver<sup>1</sup>, Hon Nian Chua<sup>1,2</sup>, Julie Dunham<sup>2</sup>, Weidong Tian<sup>4</sup>, Judith A. Blake<sup>3</sup>, Frederick P. Roth<sup>5,1,2,6</sup>
<sup>1</sup>Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115
<sup>2</sup>Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, Ontario M5S-3E1, Canada
<sup>3</sup>Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, Maine 04609
<sup>4</sup>Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai 200433, P. R. China
<sup>5</sup>Center for Cancer Systems Biology, Dana Farber Cancer Institute, Boston, Massachusetts 02115
<sup>6</sup>Samuel Lunenfeld Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G-1X5, Canada

Abstract

The body of human genomic and proteomic evidence continues to grow at ever-increasing rates, while annotation efforts struggle to keep pace. A surprisingly small fraction of human genes have clear, documented associations with specific functions, and new functions continue to be found for characterized genes. Here we assembled an integrated collection of diverse genomic and proteomic data for 21,341 human genes and made quantitative associations of each gene to 4333 Gene Ontology terms. We combined guilt-by-profiling and guilt-by-association approaches to exploit features unique to each data type. Performance was evaluated by cross-validation, by prospective validation, and by manual evaluation against the biological literature. Functional-linkage networks (FLNs) were also constructed, and their utility was demonstrated by identifying candidate genes related to a glioma FLN using a seed network from genome-wide association studies. Our annotations are presented, alongside existing validated annotations, in a publicly accessible and searchable web interface.

