xMSanalyzer: automated pipeline for improved feature detection and downstream analysis of large-scale, non-targeted metabolomics data

BMC Bioinformatics - Volume 14, Issue 1 - 2013
Karan Uppal (1,2), Quinlyn A. Soltow (3), Frederick H. Strobel (4), W. Stephen Pittard (1), Kim M. Gernert (1), Tianwei Yu (5), Dean P. Jones (3)
(1) BimCore, School of Medicine, Emory University, Atlanta, USA
(2) School of Biology, Georgia Institute of Technology, Atlanta, USA
(3) Department of Medicine, Division of Pulmonary, Allergy and Critical Care, Emory University, Atlanta, USA
(4) Mass Spectrometry Center, Emory University, Atlanta, USA
(5) Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, USA

Abstract

Background

Detection of low-abundance metabolites is important for de novo mapping of metabolic pathways related to diet, the microbiome, or environmental exposures. Multiple algorithms are available to extract m/z features from liquid chromatography-mass spectral data, but they do so conservatively, which tends to preclude detection of low-abundance chemicals and of chemicals found in only small subsets of samples. The present study provides software that enhances such algorithms for feature detection, quality assessment, and annotation.

Results

xMSanalyzer is a set of utilities for automated processing of metabolomics data. The utilities can be classified into four main modules that: 1) improve feature detection for replicate analyses by systematic re-extraction with multiple parameter settings, followed by data merging to optimize the balance between sensitivity and reliability, 2) evaluate sample quality and feature consistency, 3) detect feature overlap between datasets, and 4) characterize high-resolution m/z matches to small molecule metabolites and biological pathways using multiple chemical databases. The package was tested with plasma samples and was shown to more than double the number of features extracted while improving the quantitative reliability of detection. MS/MS analysis of a random subset of peaks that were detected exclusively using xMSanalyzer confirmed that the optimization scheme improves detection of real metabolites.
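The idea behind modules 1 and 3 can be illustrated with a minimal Python sketch. It is not the xMSanalyzer implementation (the package itself is written in R): the `Feature` class, the function names, the 10 ppm and 30 s tolerances, and the rule of keeping the lower-CV feature on overlap are all assumptions made for illustration. The sketch takes feature lists extracted under two parameter settings, treats two features as the same when their m/z and retention time agree within tolerance, and returns the merged union.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    mz: float  # mass-to-charge ratio
    rt: float  # retention time in seconds
    cv: float  # coefficient of variation across technical replicates (%)


def ppm_diff(mz_a: float, mz_b: float) -> float:
    """Absolute m/z difference expressed in parts per million of mz_b."""
    return abs(mz_a - mz_b) / mz_b * 1e6


def same_feature(a: Feature, b: Feature, max_ppm: float = 10.0, max_rt: float = 30.0) -> bool:
    """Two features overlap if both m/z and retention time agree within tolerance."""
    return ppm_diff(a.mz, b.mz) <= max_ppm and abs(a.rt - b.rt) <= max_rt


def merge_features(set_a, set_b, max_ppm: float = 10.0, max_rt: float = 30.0):
    """Union of two feature lists; on overlap, keep the feature with the lower CV
    (an assumed reliability criterion for this sketch)."""
    merged = list(set_a)
    for fb in set_b:
        for i, fa in enumerate(merged):
            if same_feature(fa, fb, max_ppm, max_rt):
                if fb.cv < fa.cv:
                    merged[i] = fb
                break
        else:
            # No overlapping feature found: fb is unique to set_b, so add it.
            merged.append(fb)
    return merged


# Hypothetical features from two extraction runs with different parameter settings.
run1 = [Feature(181.0707, 60.0, 12.0), Feature(300.1000, 120.0, 8.0)]
run2 = [Feature(181.0708, 62.0, 5.0), Feature(450.2000, 200.0, 20.0)]
merged = merge_features(run1, run2)
```

Here the two features near m/z 181.07 are treated as one and the lower-CV copy is retained, while the features found in only one run are both kept, which is how merging can raise the feature count without discarding the more reproducible measurements.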

Conclusions

xMSanalyzer is a package of utilities for data extraction, quality control assessment, detection of overlapping and unique metabolites in multiple datasets, and batch annotation of metabolites. The program was designed to integrate with existing packages such as apLCMS and XCMS, but the framework can also be used to enhance data extraction for other LC/MS data software.
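Batch annotation by high-resolution m/z matching can be sketched in the same spirit. The example below is a simplified illustration in Python, not the package's actual code: the in-memory `DATABASE`, the `annotate` function, the 10 ppm tolerance, and the restriction to the [M+H]+ adduct are assumptions. A real workflow would query chemical databases and consider several adducts.

```python
PROTON_MASS = 1.007276  # Da, mass added to a neutral molecule for an [M+H]+ ion

# Hypothetical mini "database" of neutral monoisotopic masses (Da).
DATABASE = {
    "glucose": 180.06339,
    "cystine": 240.02385,
    "cysteine": 121.01975,
}


def annotate(mz: float, database=DATABASE, max_ppm: float = 10.0):
    """Return (name, ppm error) for every database entry whose expected
    [M+H]+ m/z lies within max_ppm of the observed m/z."""
    hits = []
    for name, neutral_mass in database.items():
        expected_mz = neutral_mass + PROTON_MASS
        error_ppm = (mz - expected_mz) / expected_mz * 1e6
        if abs(error_ppm) <= max_ppm:
            hits.append((name, error_ppm))
    return hits


# Annotate a batch of observed m/z values (hypothetical measurements).
observed = [181.0707, 122.0270, 500.0000]
annotations = {mz: [name for name, _ in annotate(mz)] for mz in observed}
```

Matching on accurate mass alone can return several candidates for one m/z, so results of this kind are best read as putative annotations that still need confirmation, for example by MS/MS.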
