Prioritizing candidate disease genes by network-based boosting of genome-wide association data

Genome Research - Volume 21, Issue 7 - Pages 1109-1121 - 2011
Insuk Lee (1), U. Martin Blom (2, 3), Peggy I. Wang (2), Jung Eun Shim (1), Edward M. Marcotte (2, 3)
(1) Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, 262 Seongsanno, Seodaemun-gu, Seoul 120-749, Korea
(2) Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas, Austin, Texas 78712, USA
(3) Program in Computational and Applied Mathematics, University of Texas, Austin, Texas 78712, USA

Abstract

Network “guilt by association” (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK–STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
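
The idea of propagating uncertain seed evidence through a gene network can be illustrated with a small sketch. The code below is not the authors' algorithm; it is a generic random walk with restart (a PageRank-style label propagation), written under the assumption that each seed gene carries a weight between 0 and 1 reflecting how strongly it is implicated. The function name `propagate_scores`, the `restart` value, and the toy gene names are all hypothetical.

```python
def propagate_scores(edges, seed_weights, restart=0.3, iterations=100):
    """Random walk with restart over a weighted, undirected network.

    edges: iterable of (gene_a, gene_b, weight) tuples.
    seed_weights: dict mapping gene -> evidence weight (soft seed labels).
    Returns a dict mapping every gene to its propagated score.
    """
    # Build a symmetric weighted adjacency map.
    neighbors = {}
    for a, b, w in edges:
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w
    for gene in seed_weights:
        neighbors.setdefault(gene, {})

    # Normalize the seed evidence into a restart distribution.
    total = sum(seed_weights.values())
    prior = {g: seed_weights.get(g, 0.0) / total for g in neighbors}

    scores = dict(prior)
    for _ in range(iterations):
        # Each step restarts at the seeds with probability `restart`.
        updated = {g: restart * prior[g] for g in neighbors}
        for gene, nbrs in neighbors.items():
            degree = sum(nbrs.values())
            if degree == 0:
                # An isolated gene hands its score back to the seeds.
                for g in neighbors:
                    updated[g] += (1 - restart) * scores[gene] * prior[g]
                continue
            # Otherwise the score spreads in proportion to edge weight.
            for nbr, w in nbrs.items():
                updated[nbr] += (1 - restart) * scores[gene] * w / degree
        scores = updated
    return scores


# Toy path network A - B - C - D with a strong seed (A) and a weak seed (D).
toy_edges = [("A", "B", 1.0), ("B", "C", 1.0), ("C", "D", 1.0)]
toy_scores = propagate_scores(toy_edges, {"A": 0.9, "D": 0.1})
for gene, score in sorted(toy_scores.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 3))
```

In this toy case the unlabeled gene next to the strongly implicated seed (B) ends up ranked above the one next to the weakly implicated seed (C), which is the qualitative behavior that weighting the seed set is meant to produce.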
