Inferring the functions of longevity genes with modular subnetwork biomarkers of Caenorhabditis elegans aging

Genome Biology - Volume 11 - Pages 1-15 - 2010
Kristen Fortney1, Max Kotlyar1, Igor Jurisica1,2,3
1Department of Medical Biophysics, University of Toronto, Toronto, Canada
2The Campbell Family Institute for Cancer Research and Ontario Cancer Institute, Toronto, Canada
3Department of Computer Science, University of Toronto, Toronto, Canada

Abstract

A central goal of biogerontology is to identify robust gene-expression biomarkers of aging. Here we develop a method in which the biomarkers are networks of genes selected based on age-dependent activity and a graph-theoretic property called modularity. Tested on Caenorhabditis elegans, our algorithm yields better biomarkers than previous methods: they are more conserved across studies and are better predictors of age. We apply these modular biomarkers to assign novel aging-related functions to poorly characterized longevity genes.
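
The abstract names two selection criteria for a subnetwork, age-dependent activity and modularity, without giving formulas. As a rough illustration only, the sketch below scores a candidate gene set on both. It assumes a simple local modularity (the share of the members' total degree that stays inside the subnetwork) and measures activity as the absolute Pearson correlation between mean subnetwork expression and age. The function names, the toy network, and both scoring choices are assumptions made for this example and are not taken from the paper, whose actual scoring functions may differ.

```python
from math import sqrt


def local_modularity(edges, members):
    """Share of the members' total degree that stays inside the subnetwork.

    Hypothetical scoring choice: 2 * internal / (2 * internal + boundary).
    """
    members = set(members)
    internal = boundary = 0
    for u, v in edges:
        inside = (u in members) + (v in members)
        if inside == 2:
            internal += 1      # both endpoints in the subnetwork
        elif inside == 1:
            boundary += 1      # edge leaving the subnetwork
    total = 2 * internal + boundary
    return 2 * internal / total if total else 0.0


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def age_activity(expression, members, ages):
    """Absolute correlation between mean subnetwork expression and age."""
    members = list(members)
    mean_expr = [
        sum(expression[g][i] for g in members) / len(members)
        for i in range(len(ages))
    ]
    return abs(pearson(mean_expr, ages))


# Toy data (invented for illustration): a triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
expression = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [0, 1, 2]}
ages = [1, 5, 10]

candidate = {"a", "b", "c"}
modularity_score = local_modularity(edges, candidate)
activity_score = age_activity(expression, candidate, ages)
print(modularity_score, activity_score)
```

In this toy network the triangle a-b-c has three internal edges and one boundary edge, so its score is 6/7. A search procedure could grow candidate subnetworks while both scores remain high, although how the two criteria are combined is left open here.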
