Computational approaches to identify regulators of plant stress response using high-throughput gene expression data

Current Plant Biology, Volume 3, Pages 20-29, 2015
Alexandr Koryachko (1), Anna Matthiadis (2), Joel J. Ducoste (3), James Tuck (1), Terri A. Long (2), Cranos Williams (1)
(1) Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, USA
(2) Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
(3) Civil, Construction, and Environmental Engineering, North Carolina State University, Raleigh, NC, USA
