Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics

Nature Communications - Volume 9, Issue 1
Alvaro Barbeira1, Scott Dickinson1, Rodrigo Bonazzola1, Jiamao Zheng1, Heather E. Wheeler2, Jason Torres3, Eric S. Torstenson4, Kaanan P. Shah1, Tzintzuni Garcia5, Todd L. Edwards6, Eli A. Stahl7, Laura M. Huckins7, François Aguet8, Kristin Ardlie8, Beryl B. Cummings8, Ellen Gelfand8, Gad Getz8, Kane Hadley8, Robert E. Handsaker8, Katherine Huang8, Seva Kashin8, Konrad J. Karczewski8, Monkol Lek8, Xiao Li8, Daniel G. MacArthur8, Jared L. Nedzel8, Duyen T. Nguyen8, Michael S. Noble8, Ayellet V. Segrè8, Casandra A. Trowbridge8, Taru Tukiainen8, Nathan S. Abell9, Brunilda Balliu10, Ruth Barshir11, Omer Basha11, Alexis Battle12, Gireesh K. Bogu13, Andrew Brown14, Christopher Brown15, Stephane E. Castel16, Lin Chen17, Colby Chiang18, Donald F. Conrad19, Farhan N. Damani12, Joe R. Davis9, Olivier Delaneau14, Emmanouil T. Dermitzakis14, Barbara E. Engelhardt20, Eleazar Eskin21, Pedro G. Ferreira22, Laure Frésard9, Eric R. Gamazon23, Diego Garrido-Martín13, Ariel DH Gewirtz24, Genna Gliner25, Michael J. Gloudemans9, Roderic Guigó13, Ira M. Hall18, Buhm Han26, Yuan He27, Farhad Hormozdiari21, Cédric Howald14, Brian Jo24, Eun Yong Kang21, Yungil Kim12, Sarah Kim-Hellmuth16, Tuuli Lappalainen16, Gen Li14, Xin Li10, Boxiang Liu10, Serghei Mangul21, Mark I. McCarthy28, Ian C. McDowell29, Pejman Mohammadi16, Jean Monlong13, Stephen B. Montgomery10, Manuel Muñoz-Aguirre13, Anne W. Ndungu28, Andrew B. Nobel30, Meritxell Oliva31, Halit Ongen14, John Palowitch30, Nikolaos Panousis14, Panagiotis Papasaikas13, YoSon Park15, Princy Parsana12, A. J. Payne28, Christine B. Peterson32, Jie Quan33, Ferrán Reverter13, Chiara Sabatti34, Ashis Saha12, Michael Sammeth35, Alexandra J. Scott18, Andrey A. Shabalin36, Reza Sodaei13, Matthew Stephens37, Barbara E. Stranger31, Benjamin J. Strober27, Jae Hoon Sul38, Emily K. Tsang10, Sarah Urbut39, Martijn van de Bunt28, Gao Wang39, Xiaoquan Wen40, Fred A. Wright41, Hualin Simon Xi33, Esti Yeger‐Lotem11, Zachary Zappala10, Judith B. 
Zaugg42, Yi‐Hui Zhou41, Joshua M. Akey24, Daniel J. Bates43, Joanne Chan9, Melina Claussnitzer8, Kathryn Demanelis17, Morgan Diegel43, Jennifer A. Doherty44, Andrew P. Feinberg27, Marian S. Fernando31, Jessica Halow43, Kasper D. Hansen45, Eric Haugen43, Peter F. Hickey46, Lei Hou8, Farzana Jasmine17, Ruiqi Jian9, Lihua Jiang9, Audra Johnson43, Rajinder Kaul43, Manolis Kellis8, Muhammad G. Kibriya17, Kristen Lee43, Jin Billy Li9, Qin Li9, Jessica Lin9, Shin Lin9, Sandra E. Linder10, Caroline Linke31, Yaping Liu47, Matthew T. Maurano48, Benoit Molinié8, Jemma Nelson43, Fidencio Neri43, Yongjin Park47, Brandon L. Pierce17, Nicola J. Rinaldi47, Lindsay F. Rizzardi45, Richard Sandstrom43, Andrew D. Skol31, Kevin S. Smith10, M Snyder9, J Stamatoyannopoulos43, Hua Tang9, Li Wang8, Meng Wang9, Nicholas Van Wittenberghe8, Fan Wu31, Rui Zhang9, Concepcion R. Nierras49, Philip A. Branton50, Latarsha J. Carithers50, Ping Guan50, Helen M. Moore50, Abhi K. Rao50, Jimmie B. Vaught50, Sarah E. Gould51, Nicole C. Lockart51, Casey Martin51, Jeffery P. Struewing51, Simona Volpi51, Anjené Addington52, Susan E. Koester52, A. Roger Little53, Lori E. Brigham54, Richard Hasz55, Marcus Anthony Hunter56, Christopher Johns57, Mark R. Johnson58, Gene Kopen59, William F. Leinweber59, John T. Lonsdale59, Alisa McDonald59, Bernadette Mestichelli59, Kevin Myer56, Brian Roe56, Michael F. Salvatore59, Saboor Shad59, Jeffrey A. Thomas59, Gary Walters58, Michael Washington58, J. Gary Wheeler57, Jason Bridge60, Barbara A. Foster61, Bryan M. Gillard61, Ellen Karasik61, Rachna Kumar61, Mark Miklos60, Michael T. Moser61, Scott D. Jewell62, Robert G. Montroy62, Daniel C. Rohrer62, Dana R. Valley62, David A. Davis63, Deborah C. Mash63, Anita H. Undale64, Anna Marie Smith65, David E. Tabor65, Nancy Roche65, Jeffrey A. McLean65, Negin Vatanian65, Karna Robinson65, Leslie H. Sobin65, Mary E. Barcus66, Kimberly M. 
Valentino65, Liqun Qi65, Steven Hunter65, Pushpa Hariharan65, Shilpi Singh65, Ki Sung Um65, Takunda Matose65, M. Tomaszewski65, Laura K. Barker67, Maghboeba Mosavel68, Laura A. Siminoff67, Heather M. Traino67, Paul Flicek69, Thomas Juettemann69, Magali Ruffier69, Dan Sheppard69, Kieron Taylor69, Stephen J. Trevanion69, Daniel R. Zerbino69, Brian Craft70, Mary J. Goldman70, Maximilian Haeussler70, W. James Kent70, Christopher M. Lee70, Benedict Paten70, Kate R. Rosenbloom70, John Vivian70, Jingchun Zhu70, Dan L. Nicolae1, Nancy J. Cox4, Hae Kyung Im1
1Section of Genetic Medicine, The University of Chicago, Chicago, IL 60637, USA
2Department of Biology, Loyola University Chicago, Chicago, IL 60660, USA
3Committee on Molecular Metabolism and Nutrition, The University of Chicago, Chicago, IL 60637, USA
4Vanderbilt Genetic Institute, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
5Center for Research Informatics, The University of Chicago, Chicago, IL, 60615, USA
6Division of Epidemiology, Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
7Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, NYC, NY, 10029, USA
8The Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
9Department of Genetics, Stanford University, Stanford, CA 94305, USA
10Department of Pathology, Stanford University, Stanford, CA 94305, USA
11Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
12Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
13Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain
14Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
15Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
16New York Genome Center, New York, NY 10013, USA
17Department of Public Health Sciences, The University of Chicago, Chicago, IL 60637, USA
18McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, 63108, USA
19Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA
20Department of Computer Science, Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, 08540, USA
21Department of Computer Science, University of California, Los Angeles, CA, 90095, USA
22Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
23Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, 37232, USA
24Lewis Sigler Institute, Princeton University, Princeton, NJ, 08540, USA
25Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, 08540, USA
26Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, 138-736, South Korea
27Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
28Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN, UK
29Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, NC, 27708, USA
30Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC, 27599, USA
31Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA
32Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
33Computational Sciences, Pfizer Inc, Cambridge, MA, 02139, USA
34Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
35Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), 21941902, Rio de Janeiro, Brazil
36Department of Psychiatry, University of Utah, Salt Lake City, UT 84108, USA
37Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
38Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, 90095, USA
39Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
40Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
41Bioinformatics Research Center and Departments of Statistics and Biological Sciences, North Carolina State University, Raleigh, NC, 27695, USA
42European Molecular Biology Laboratory, 69117 Heidelberg, Germany
43Altius Institute for Biomedical Sciences, Seattle, Washington, 98121, USA
44Huntsman Cancer Institute, Department of Population Health Sciences, University of Utah, Salt Lake City, UT, 84112, USA
45Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
46Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA
47Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
48Institute for Systems Genetics, New York University Langone Medical Center, New York, NY 10016, USA
49Office of Strategic Coordination, Division of Program Coordination, Planning and Strategic Initiatives, Office of the Director, NIH, Rockville, MD, 20852, USA
50Biorepositories and Biospecimen Research Branch, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD, 20892, USA
51Division of Genomic Medicine, National Human Genome Research Institute, Rockville, MD, 20852, USA
52Division of Neuroscience and Basic Behavioral Science, National Institute of Mental Health, NIH, Bethesda, MD, 20892, USA
53Division of Neuroscience and Behavior, National Institute on Drug Abuse, NIH, Bethesda, MD, 20892, USA
54Washington Regional Transplant Community, Falls Church, VA, 22003, USA
55Gift of Life Donor Program, Philadelphia, PA, 19103, USA
56LifeGift, Houston, TX, 77055, USA
57Center for Organ Recovery and Education, Pittsburgh, PA, 15238, USA
58LifeNet Health, Virginia Beach, VA, 23453, USA
59National Disease Research Interchange, Philadelphia, PA, 19103, USA
60Unyts, Buffalo, NY, 14203, USA
61Pharmacology and Therapeutics, Roswell Park Cancer Institute, Buffalo, NY, 14263, USA
62Van Andel Research Institute, Grand Rapids, MI 49503, USA
63Brain Endowment Bank, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA
64National Institute of Allergy and Infectious Diseases, NIH, Rockville, MD 20852, USA
65Biospecimen Research Group, Clinical Research Directorate, Leidos Biomedical Research, Inc., Rockville, MD, 20852, USA
66Leidos Biomedical Research, Inc., Frederick, MD 21701, USA
67Temple University, Philadelphia, PA 19122, USA
68Department of Health Behavior and Policy, School of Medicine, Virginia Commonwealth University, Richmond, VA, 23298, USA
69European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
70UCSC Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA

Abstract

Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.
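
The abstract does not spell out the summary-statistics expression, so the following is only a minimal sketch of the commonly stated S-PrediXcan approximation, Z_g ≈ Σ_l w_lg (σ_l / σ_g) (β_l / se(β_l)), where w_lg are the prediction weights of SNP l for gene g, σ_l is the SNP standard deviation, and σ_g² = w'Γw is the variance of predicted expression computed from a reference-panel SNP covariance matrix Γ. The function name `s_predixcan_z` and its argument names are hypothetical and are not taken from the authors' software.

```python
import math


def s_predixcan_z(weights, gwas_z, cov):
    """Approximate a gene-level Z-score from SNP-level GWAS Z-scores.

    weights : prediction weights w_l for the SNPs in the gene's model
    gwas_z  : GWAS Z-scores beta_l / se(beta_l) for the same SNPs
    cov     : SNP covariance matrix (list of lists) estimated from a
              reference panel; this input is an assumption of the sketch
    """
    n = len(weights)
    if len(gwas_z) != n or len(cov) != n:
        raise ValueError("weights, gwas_z and cov must have matching sizes")

    # Variance of predicted expression: sigma_g^2 = w' * Gamma * w
    var_g = sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_g <= 0:
        raise ValueError("predicted expression variance must be positive")
    sigma_g = math.sqrt(var_g)

    # Weighted sum of SNP Z-scores, each scaled by sigma_l / sigma_g
    return sum(
        weights[l] * math.sqrt(cov[l][l]) / sigma_g * gwas_z[l]
        for l in range(n)
    )
```

With a single-SNP model the gene-level score reduces to that SNP's GWAS Z-score (up to the sign of the weight), which is a quick sanity check on any implementation of this form.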
