MetabR: an R script for linear model analysis of quantitative metabolomic data

Springer Science and Business Media LLC - Volume 5 - Pages 1-10 - 2012
Ben Ernest1,2, Jessica R Gooding3, Shawn R Campagna3, Arnold M Saxton1,2, Brynn H Voy1,2
1Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, USA
2Department of Animal Science, University of Tennessee, Knoxville, USA
3Department of Chemistry, University of Tennessee, Knoxville, USA

Abstract

Metabolomics is an emerging high-throughput approach to systems biology, but data analysis tools are lacking compared to other systems-level disciplines such as transcriptomics and proteomics. Metabolomic data analysis requires a normalization step to remove systematic effects of confounding variables on metabolite measurements. Current tools may not normalize every metabolite correctly when the relationship between metabolite quantity and a fixed-effect confounding variable differs from one metabolite to the next, and they may not correct for the effects of random-effect confounding variables. Linear mixed models, an established methodology in the microarray literature, offer a standardized and flexible approach for removing the effects of fixed- and random-effect confounding variables from metabolomic data. Here we present a simple menu-driven program, “MetabR”, designed to aid researchers with no programming background in the statistical analysis of metabolomic data. Written in the open-source statistical programming language R, MetabR uses linear mixed models to normalize metabolomic data and analysis of variance (ANOVA) to test for treatment differences. MetabR exports normalized data, checks statistical model assumptions, identifies differentially abundant metabolites, and produces output files to help with data interpretation. Example data are provided to illustrate normalization for common confounding variables and to demonstrate the utility of the MetabR program. We developed MetabR as a simple and user-friendly tool for linear mixed model-based normalization and statistical analysis of targeted metabolomic data, helping to fill the gap in data analysis tools for this field. The program, user guide, example data, and any future news or updates related to the program may be found at http://metabr.r-forge.r-project.org/.
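The normalize-then-test workflow described in the abstract can be illustrated with a small, simplified sketch. This is not MetabR's code: MetabR is an R program built on linear mixed models, whereas the Python below treats a single confounder (a hypothetical `batch` label) as a fixed effect, assumes a balanced design in which every batch contains every treatment equally often, and uses made-up intensities. The function names `remove_batch_effect` and `anova_f` are illustrative only.

```python
import math
from collections import defaultdict


def group_means(values, labels):
    """Return {label: mean of the values carrying that label}."""
    buckets = defaultdict(list)
    for value, label in zip(values, labels):
        buckets[label].append(value)
    return {label: sum(xs) / len(xs) for label, xs in buckets.items()}


def remove_batch_effect(intensities, batches):
    """Log2-transform one metabolite and subtract each batch's offset.

    Assumption: balanced design. Under the additive model
    log2(y) = mu + treatment + batch + error, the least-squares batch
    effect is then simply the batch mean minus the grand mean.
    """
    logs = [math.log2(v) for v in intensities]
    grand_mean = sum(logs) / len(logs)
    batch_mean = group_means(logs, batches)
    return [x - (batch_mean[b] - grand_mean) for x, b in zip(logs, batches)]


def anova_f(values, treatments, df_spent=0):
    """One-way ANOVA F statistic for treatment differences.

    df_spent is the number of residual degrees of freedom already used
    by normalization (number of batches minus one in this sketch).
    """
    means = group_means(values, treatments)
    grand_mean = sum(values) / len(values)
    counts = defaultdict(int)
    for t in treatments:
        counts[t] += 1
    ss_between = sum(counts[t] * (m - grand_mean) ** 2 for t, m in means.items())
    ss_within = sum((x - means[t]) ** 2 for x, t in zip(values, treatments))
    df_between = len(means) - 1
    df_within = len(values) - len(means) - df_spent
    return (ss_between / df_between) / (ss_within / df_within)


# Made-up data for one metabolite: treatment B is one log2 unit above A,
# and batch 2 reads two log2 units higher than batch 1.
log_truth = [10.0, 10.2, 11.0, 11.2, 12.0, 12.2, 13.0, 13.2]
intensities = [2 ** x for x in log_truth]
treatments = ["A", "A", "B", "B", "A", "A", "B", "B"]
batches = [1, 1, 1, 1, 2, 2, 2, 2]

raw_logs = [math.log2(v) for v in intensities]
normalized = remove_batch_effect(intensities, batches)

f_raw = anova_f(raw_logs, treatments)
f_norm = anova_f(normalized, treatments, df_spent=1)
print("F before normalization:", round(f_raw, 3))
print("F after normalization: ", round(f_norm, 3))
```

In this toy example the batch offset inflates the within-treatment variance and masks the treatment difference until it is removed. A mixed model goes further by treating confounders such as run or sample as random effects and fitting each metabolite separately, which this fixed-effect sketch does not attempt.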
