maSigPro: a method to identify significantly differential expression profiles in time-course microarray experiments

Bioinformatics (Oxford, England) - Volume 22, Issue 9, Pages 1096-1102, 2006
Ana Conesa1, María José Nueda2, Alberto Ferrer3, Manuel Talón1
1Centro de Genómica, Instituto Valenciano de Investigaciones Agrarias, Apartado Oficial 46113, Moncada, Valencia, Spain
2Departamento de Estadística e Investigación Operativa, Universidad de Alicante, Apartado 03080, Alicante, Spain
3Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universidad Politécnica de Valencia, Apartado 46022, Valencia, Spain

Abstract

Motivation: Multi-series time-course microarray experiments are useful approaches for exploring biological processes. In this type of experiment, the researcher is frequently interested in studying gene expression changes over time and in evaluating trend differences between the various experimental groups. The large amount of data, the multiplicity of experimental conditions and the dynamic nature of the experiments pose great challenges to data analysis.

Results: In this work, we propose a statistical procedure to identify genes that show different expression profiles across analytical groups in time-course experiments. The method is a two-step regression approach in which the experimental groups are identified by dummy variables. In the first step, the procedure fits a global regression model with all the defined variables to identify differentially expressed genes. In the second step, a variable selection strategy is applied to study differences between groups and to find statistically significant different profiles. The methodology is illustrated on both a real and a simulated microarray dataset.
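
To make the first regression step concrete, the following is a minimal sketch in plain Python. It is not the maSigPro implementation. It assumes a polynomial time trend, a single reference group, and ordinary least squares for one gene; the function names (`design_row`, `fit_ols`, `global_f_statistic`) and the toy expression values are hypothetical and chosen only for illustration. The second step (variable selection) is not shown.

```python
# Illustrative sketch only: dummy-variable design plus a global F statistic
# for one gene. All names and data here are assumptions, not maSigPro code.

def design_row(t, group, groups, degree=1):
    """Row of the design matrix: intercept, powers of time, then a dummy and
    dummy-by-time terms for every group except the reference (groups[0])."""
    row = [1.0] + [float(t) ** p for p in range(1, degree + 1)]
    for g in groups[1:]:
        d = 1.0 if group == g else 0.0
        row.append(d)
        row.extend(d * float(t) ** p for p in range(1, degree + 1))
    return row

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(X, y):
    """Least-squares coefficients (normal equations) and residual sum of squares."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, rss

def global_f_statistic(X, y):
    """F statistic of the full model against an intercept-only model."""
    n, k = len(X), len(X[0])
    _, rss = fit_ols(X, y)
    mean = sum(y) / n
    tss = sum((yi - mean) ** 2 for yi in y)
    return ((tss - rss) / (k - 1)) / (rss / (n - k))

# Toy data for a single gene: two groups measured at four time points.
groups = ["control", "treated"]
samples = [(t, g) for g in groups for t in (0, 1, 2, 3)]
y = [1.0, 1.1, 0.9, 1.0,   # control: roughly flat
     1.1, 3.0, 4.9, 7.2]   # treated: rising with time
X = [design_row(t, g, groups) for t, g in samples]

beta, rss = fit_ols(X, y)      # [intercept, time, dummy, dummy x time]
f_stat = global_f_statistic(X, y)
print(beta, f_stat)
```

In this sketch the coefficient on the dummy-by-time term measures how the treated group's slope departs from the reference group's slope, which is the kind of between-group trend difference the second step would then test variable by variable. A large global F statistic only flags the gene as a candidate; turning it into a p-value and correcting for multiple testing across genes is left out here.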

Availability: The method has been implemented in the statistical language R and is freely available from the Bioconductor contributed packages repository and from

Contact:  [email protected]; [email protected]
