From data to knowledge: The future of multi-omics data analysis for the rhizosphere

Rhizosphere - Tập 3 - Trang 222-229 - 2017
Richard Allen White1, Mark I. Borkum1, Albert Rivas-Ubach1, Aivett Bilbao1, Jason P. Wendler1, Sean M. Colby1, Martina Köberl2, Christer Jansson1
1Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
2Institute of Environmental Biotechnology, Graz University of Technology, Petersgasse 12, 8010 Graz, Austria

Tài liệu tham khảo

Altschul, 1990, Basic local alignment search tool, J. Mol. Biol., 215, 403, 10.1016/S0022-2836(05)80360-2 Andrews, S., 2017. FastQC. 〈http://www.bioinformatics.babraham.ac.uk/projects/fastqc/〉. Aronesty, E. Command-line Tools for Processing Biological Sequencing Data, ea-utils, Expression Analysis. Durham, NC. Available online at: 〈http://code.google.com/p/ea-utils〉. Asay, 2008, The general public license version 3.0: making or breaking the foss movement, Mich. Telecommun. Technol. Law Rev., 14, 265 Bao, 2014, Metaproteomic identification of diazotrophic methanotrophs and their localization in root tissues of field-grown rice plants, Appl. Environ. Microbiol., 80, 5043, 10.1128/AEM.00969-14 Bersanelli, Matteo, D.R.E.G.C.S.G.C., Mosca, Ettore, Milanesi, L., Methods for the integration of multi-omics data: mathematical aspects. BMC Bioinform. 17(5) http://dx.doi.org/10.1186/s12859-015-0857-9. URL 〈https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0857-9〉. Bilbao, 2016, Dedicated software enhancing data-independent acquisition methods in mass spectrometry, CHIMIA International Journal for Chemistry, 70, 293, 10.2533/chimia.2016.293 Bilbao, 2015, Processing strategies and software solutions for data-independent acquisition in mass spectrometry, Proteomics, 15, 964, 10.1002/pmic.201400323 Bolger, 2014, Trimmomatic: a flexible trimmer for Illumina sequence data, Bioinformatics, 10.1093/bioinformatics/btu170 Bruschi, 2008, HORA suite: a database and software for human metabolomics, Metabolomics, 4, 90, 10.1007/s11306-007-0095-x Buchfink, 2015, Fast and sensitive protein alignment using diamond, Nat. Methods, 12, 59, 10.1038/nmeth.3176 Bulgarelli, 2012, Revealing structure and assembly cues for arabidopsis root-inhabiting bacterial microbiota, Nature, 488, 91, 10.1038/nature11336 Burns, R.G., 2010. Albert Rovira and a half-century of rhizosphere research. In: Proceedings of the Rovira Rhizosphere Symposium. p. 1. Caldwell, 2015, Prokaryotic diversity in the rhizosphere of organic, intensive, and transitional coffee farms in brazil, PLoS One, 10, e0106355, 10.1371/journal.pone.0106355 Chen, 2016, IMG/M: integrated genome and metagenome comparative data analysis system, Nucleic Acids Res. Chevreux, 2004, Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs, Genome Res., 14, 1147, 10.1101/gr.1917404 Cox, 2008, Maxquant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification, Nat. Biotechnol., 26, 1367, 10.1038/nbt.1511 Cox, 2011, Andromeda: a peptide search engine integrated into the maxquant environment, J. Proteome Res., 10, 1794, 10.1021/pr101065j Craig, 2004, TANDEM: matching proteins with tandem mass spectra, Bioinformatics, 20, 1466, 10.1093/bioinformatics/bth092 Crick, F.H., 1958. On protein synthesis. In: Sanders, F. (Ed.), Symposia of the Society for Experimental Biology, Number XII: The Biological Replication of Macromolecules. Cambridge University Press, pp. 138–163. Crusoe, M.R., Alameldin, H.F., Awad, S., Boucher, E., Caldwell, A., Cartwright, R., Charbonneau, A., Constantinides, B., Edvenson, G., Fay, S., et al. The khmer Software Package: Enabling Efficient Nucleotide Sequence Analysis. F1000Research 4. D.J.G. Institute 2017. BBTools. 〈http://jgi.doe.gov/data-and-tools/bbtools/〉. Delaney, N., 2017. Cafe-quality: Tools to Assess and Diagnose Accuracy Issues in PacBio Data. Available at: 〈https://github.com/evolvedmicrobe/cafe-quality〉. Deutsch, 2010, A guided tour of the trans-proteomic pipeline, Proteomics, 10, 1150, 10.1002/pmic.200900375 Doornbos, 2012, Impact of root exudates and plant defense signaling on bacterial communities in the rhizosphere. A review, Agron. Sustain. Dev., 32, 227, 10.1007/s13593-011-0028-y Egamberdieva, 2008, High incidence of plant growth-stimulating bacteria associated with the rhizosphere of wheat grown on salinated soil in Uzbekistan, Environ. Microbiol., 10, 1 Goodwin, 2016, Coming of age: ten years of next-generation sequencing technologies, Nat. Rev. Genet., 17, 333, 10.1038/nrg.2016.49 Gowda, 2014, Interactive XCMS online: simplifying advanced metabolomic data processing and subsequent statistical analyses, Anal. Chem., 86, 6931, 10.1021/ac500734c Gurevich, 2013, QUAST: quality assessment tool for genome assemblies, Bioinformatics, 29, 1072, 10.1093/bioinformatics/btt086 Hannon, G., 2010. FASTX-Toolkit. 〈http://hannonlab.cshl.edu/fastx_toolkit/〉. Hartmann, 2008, Lorenz Hiltner, a pioneer in rhizosphere microbial ecology and soil bacteriology research, Plant Soil, 312, 7, 10.1007/s11104-007-9514-z Hauswedell, 2014, Lambda: the local aligner for massive biological data, Bioinformatics, 30, i349, 10.1093/bioinformatics/btu439 Heinonen, 2008, FiD: a software for ab initio structural identification of product ions from tandem mass spectrometric data, Rapid Commun. Mass Spectrom., 22, 3043, 10.1002/rcm.3701 Hettich, R.L., Pan, C., Chourey, K., Giannone, R.J., 2013. Metaproteomics: Harnessing the Power of High Performance Mass Spectrometry to Identify the Suite of Proteins That Control Metabolic Activities in Microbial Communities. Horlacher, 2015, MzJava: an open source library for mass spectrometry data processing, J. Proteom., 129, 63, 10.1016/j.jprot.2015.06.013 Huang, 1999, CAP3: a DNA sequence assembly program, Genome Res., 9, 868, 10.1101/gr.9.9.868 Huang, 2016, Integration of string and de Bruijn graphs for genome assembly, Bioinformatics, 32, 1301, 10.1093/bioinformatics/btw011 Hultman, 2015, Multi-omics of permafrost, active layer and thermokarst bog soil microbiomes, Nature, 521, 208, 10.1038/nature14238 Hunt, 2014, A comprehensive evaluation of assembly scaffolding tools, Genome Biol., 15, R42, 10.1186/gb-2014-15-3-r42 Jain, M., Olsen, H.E., Paten, B., Akeson, M., The oxford nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17(1). http://dx.doi.org/10.1186/s13059-016-1103-0. 〈https://doi.org/10.1186%2Fs13059-016-1103-0〉. Jansson, 2011, Towards tera terra: terabase sequencing of terrestrial metagenomics, Microbe Joshi, N., Fass, J., Sickle: A Sliding-window, Adaptive, Quality-based Trimming tool for FastQ Files. Available from: 〈https://github.com/najoshi/sickle〉. Jouhten, 2014, Labelling analysis for 13C MFA using NMR spectroscopy, 143 Kajihata, 2014, OpenMebius: an open source software for isotopically nonstationary 13C-based metabolic flux analysis, BioMed Res. Int., 10.1155/2014/627014 de Keersmaecker, 2006, Integration of omics data: how well does it work for bacteria?, Mol. Microbiol., 62, 1239, 10.1111/j.1365-2958.2006.05453.x Kent, 2002, BLAT–the BLAST-like alignment tool, Genome Res., 12, 656, 10.1101/gr.229202 Kessler, 2013, MeltDB 2.0 – advances of the metabolomics software system, Bioinformatics, 29, 2452, 10.1093/bioinformatics/btt414 Kessner, 2008, ProteoWizard: open source software for rapid proteomics tools development, Bioinformatics, 24, 2534, 10.1093/bioinformatics/btn323 Kim, 2013, Analytical tools and databases for metagenomics in the next-generation sequencing era, Genom. Inform., 11, 102, 10.5808/GI.2013.11.3.102 Kim, 2014, MS-GF+ makes progress towards a universal database search tool for proteomics, Nat. Commun., 5, 5277, 10.1038/ncomms6277 Kim, D., Hahn, A.S., Hanson, N.W., Konwar, K.M., Hallam, S.J., 2014. LAST+: Optimized Threading for Fast Annotation. Available at 〈https://github.com/hallamlab/LAST-Plus〉. Knief, 2012, Metaproteogenomic analysis of microbial communities in the phyllosphere and rhizosphere of rice, ISME J., 6, 1378, 10.1038/ismej.2011.192 Koren, 2015, One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly, Curr. Opin. Microbiol., 23, 110, 10.1016/j.mib.2014.11.014 Koslicki, 2016, MetaPalette: a k-mer painting approach for metagenomic taxonomic profiling and quantification of novel strain variation, mSystems, 1, 10.1128/mSystems.00020-16 Kultima, 2016, MOCAT2: a metagenomic assembly, annotation and profiling framework, Bioinformatics, 32, 2520, 10.1093/bioinformatics/btw183 Langmead, 2012, Fast gapped-read alignment with bowtie 2, Nat. Methods, 9, 357, 10.1038/nmeth.1923 Larsen, P.E., Sreedasyam, A., Trivedi, G., Desai, S., Dai, Y., Cseke, L.J., Collart, F.R., Multi-omics approach identifies molecular mechanisms of plant-fungus mycorrhizal interaction. Front. Plant Sci. 2016. http://dx.doi.org/10.3389/fpls.2015.01061. URL 〈http://doi.org/10.3389%2Ffpls.2015.01061〉. Li, 2015, MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de bruijn graph, Bioinformatics Li, 2016, MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices, Methods, 102, 3, 10.1016/j.ymeth.2016.02.020 Li, 2009, Fast and accurate short read alignment with burrows-wheeler transform, Bioinformatics, 25, 1754, 10.1093/bioinformatics/btp324 Li, Z., Wang, Y., Yao, Q., Justice, N.B., Ahn, T.-H., Xu, D., Hettich, R.L., Banfield, J.F., Pan, C. Diverse and divergent protein post-translational modifications in two growth stages of a natural microbial community. Nat. Commun. 5. Lin, 2016, Assembly of long error-prone reads using de Bruijn graphs, Proc. Natl. Acad. Sci. USA, E8396, 10.1073/pnas.1604560113 Lingner, 2011, CoMet-a web server for comparative functional profiling of metagenomes, Nucleic Acids Res., 10.1093/nar/gkr388 Luo, 2012, SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler, Gigascience, 1, 18, 10.1186/2047-217X-1-18 MacLean, 2010, Skyline: an open source document editor for creating and analyzing targeted proteomics experiments, Bioinformatics, 26, 966, 10.1093/bioinformatics/btq054 Marçais, 2011, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers, Bioinformatics, 27, 764, 10.1093/bioinformatics/btr011 Mardis, 2013, Next-generation sequencing platforms, Ann. Rev. Anal. Chem., 6, 287, 10.1146/annurev-anchem-062012-092628 Marschner, 2002, Spatial and temporal dynamics of the microbial community structure in the rhizosphere of cluster roots of white lupin (Lupinus albus L.), Plant Soil, 246, 167, 10.1023/A:1020663909890 Melsted, 2011, Efficient counting of k-mers in DNA sequences using a bloom filter, BMC Bioinform., 12, 333, 10.1186/1471-2105-12-333 Mendes, 2014, Taxonomical and functional microbial community selection in soybean rhizosphere, ISME J., 8, 1577, 10.1038/ismej.2014.17 Mendes, 2011, Deciphering the rhizosphere microbiome for disease-suppressive bacteria, Science, 332, 1097, 10.1126/science.1203980 Menikarachchi, 2012, A software package enabling HPLC/MS-based identification of unknown chemical structures, Anal. Chem., 84, 9388, 10.1021/ac302048x Meyer, 2008, The metagenomics RAST server-a public resource for the automatic phylogenetic and functional analysis of metagenomes, BMC Bioinform., 9, 386, 10.1186/1471-2105-9-386 Mikheenko, 2015, MetaQUAST: evaluation of metagenome assemblies, Bioinformatics Mitchell, 2015, EBI metagenomics in 2016 – an expanding and evolving resource for the analysis and archiving of metagenomic data, Nucleic Acids Res. Morgan, J.A.W., Whipps, J.M., 2000. Methodological approaches to the study of rhizosphere carbon flow and microbial population dynamics. The Rhizosphere: Biochemistry and Organic Substance at the Soil-Plant Interface: Biochemistry and Organic Substance at the Soil-Plant Interface. p. 373. Mukherjee, 2015, Large-scale contamination of microbial isolate genomes by illumina PhiX control, Stand. Genom. Sci., 10, 18, 10.1186/1944-3277-10-18 Murray, 2012, The methylomes of six bacteria, Nucleic Acids Res., 40, 11450, 10.1093/nar/gks891 Narayanasamy, S., Jarosz, Y., Muller, E.E.L., Heintz-Buschart, A., Herold, M., Kaysen, A., Laczny, C.C., Pinel, N., May, P., Wilmes, P. IMP: a pipeline for reproducible reference-independent integrated metagenomic and metatranscriptomic analyses. Genome Biol. 17(1), 2016. http://dx.doi.org/10.1186/s13059-016-1116-8. 〈https://doi.org/10.1186%2Fs13059-016-1116-8〉. Nesvizhskii, 2010, A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics, J. Proteom., 73, 2092, 10.1016/j.jprot.2010.08.009 Newman, 2016, Changes in rhizosphere bacterial gene expression following glyphosate treatment, Sci. Total Environ., 553, 32, 10.1016/j.scitotenv.2016.02.078 Nilsson, 2016, Simultaneous tracing of carbon and nitrogen isotopes in human cells, Mol. BioSyst., 12, 1929, 10.1039/C6MB00009F Ounit, 2015, CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers, BMC Genom., 16, 236, 10.1186/s12864-015-1419-2 Peiffer, 2013, Diversity and heritability of the maize rhizosphere microbiome under field conditions, Proc. Natl. Acad. Sci. USA, 110, 6548, 10.1073/pnas.1302837110 Peng, 2012, IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth, Bioinformatics, 28, 1420, 10.1093/bioinformatics/bts174 Prosser, 2015, Dispersing misconceptions and identifying opportunities for the use of 'omics' in soil microbial ecology, Nat. Rev. Microbiol., 13, 439, 10.1038/nrmicro3468 QIBEBT Single-Cell Center Bioinformatics Group, Parallel-QC, 2017. 〈http://www.computationalbioenergy.org/parallel-qc.html〉. Quek, 2009, OpenFLUX: efficient modelling software for 13C-based metabolic flux analysis, Microb. Cell Fact., 8, 1, 10.1186/1475-2859-8-25 Robinette, 2008, Web server based complex mixture analysis by NMR, Anal. Chem., 80, 3606, 10.1021/ac702530t Rognes, 2016, VSEARCH: a versatile open source tool for metagenomics, PeerJ, 4, e2584, 10.7717/peerj.2584 Rooijers, 2011, An iterative workflow for mining the human intestinal metaproteome, BMC Genom., 12, 6, 10.1186/1471-2164-12-6 Rost, 2016, Openms: a flexible open-source software platform for mass spectrometry data analysis, Nat. Methods, 13, 741, 10.1038/nmeth.3959 Roudier, 2015, The rise of information science: a changing landscape for soil science, IOP Conf. Ser.: Earth Environ. Sci., 25, 012023, 10.1088/1755-1315/25/1/012023 Sangwan, N., Xia, F., Gilbert, J.A. Recovering complete and draft population genomes from metagenome datasets. Microbiome. 4(1), 2016, http://dx.doi.org/10.1186/s40168-016-0154-5 URL 〈https://doi.org/10.1186%2Fs40168-016-0154-5〉. Schmieder, 2011, Fast identification and removal of sequence contamination from genomic and metagenomic datasets, PLoS One, 6, e17288, 10.1371/journal.pone.0017288 Schmieder, 2011, Quality control and preprocessing of metagenomic datasets, Bioinformatics, 27, 863, 10.1093/bioinformatics/btr026 Schneider, 2012, Who is who in litter decomposition? Metaproteomics reveals major microbial players and their biogeochemical functions, ISME J., 6, 1749, 10.1038/ismej.2012.11 Shi, 2015, Successional trajectories of rhizosphere bacterial communities over consecutive seasons, mBio, 6, 10.1128/mBio.00746-15 Shi, 2012, Antibody-free, targeted mass-spectrometric approach for quantification of proteins at low picogram per milliliter levels in human plasma/serum, Proc. Natl. Acad. Sci. USA, 109, 15395, 10.1073/pnas.1204366109 Shupletsov, 2014, OpenFLUX2: 13C-MFA modeling software package adjusted for the comprehensive analysis of single and parallel labeling experiments, Microb. Cell Fact., 13, 152 Sović, I., Šikić, M., Wilm, A., Fenlon, S.N., Chen, S., Nagarajan, N. Fast and sensitive mapping of nanopore sequencing reads with graphmap. Nat. Commun. 7. Steinegger, 2016, Sensitive protein sequence searching for the analysis of massive data sets, bioRxiv, 079681 Tarraga, 2016, HPG pore: an efficient and scalable framework for nanopore sequencing data, BMC Bioinform., 17, 107, 10.1186/s12859-016-0966-0 Treangen, 2013, MetAMOS: a modular and open source metagenomic assembly and analysis pipeline, Genome Biol., 14, 10.1186/gb-2013-14-1-r2 Turner, 2013, Comparative metatranscriptomics reveals kingdom level changes in the rhizosphere microbiome of plants, ISME J., 7, 2248, 10.1038/ismej.2013.119 van Dam, 2016, Metabolomics in the rhizosphere: tapping into belowground chemical communication, Trends Plant Sci., 21, 256, 10.1016/j.tplants.2016.01.008 VandenBygaart, 2011, Experiment design to achieve desired statistical power, Can. J. Soil Sci., 91, 309, 10.4141/cjss2010-068 Vaser, 2016, Fast and accurate de novo genome assembly from long uncorrected reads, bioRxiv, 068122 Vaser, 2016, Sword – a highly efficient protein database search, Bioinformatics, 32, i680, 10.1093/bioinformatics/btw445 Venter, 2004, Environmental genome shotgun sequencing of the sargasso sea, Science, 304, 66, 10.1126/science.1093857 Walker, 2016, Unambiguous metabolite identification in high-throughput metabolomics by hybrid 1D 1H NMR/ESI MS1 approach, Magn. Reson. Chem., 54, 998, 10.1002/mrc.4503 White, 2013, Draft genome sequence of Exiguobacterium pavilionensis strain RW-2, with wide thermal, salinity, and pH tolerance, isolated from modern freshwater microbialites, Genome Announc., 1, 10.1128/genomeA.00597-13 White, 2016, Moleculo long-read sequencing facilitates assembly and genomic binning from complex soil metagenomes, mSystems, 1, 10.1128/mSystems.00045-16 White, 2016, The past, present and future of microbiome analyses, Nat. Protoc., 11, 2049, 10.1038/nprot.2016.148 White R.A., III., Brown, J., Colby, S., Overall, C.C., Lee, J.-Y., Zucker, J., Glaesemann, K.R., Jansson, C., Jansson, J.K., 2017b. Atlas (Automatic Tool for Local Assembly Structures) – A Comprehensive Infrastructure for Assembly, Annotation, and Genomic Binning of Metagenomic and Metatranscriptomic Data. http://dx.doi.org/10.7287/peerj.preprints.2843v1. 〈https://doi.org/10.7287/peerj.preprints.2843v1〉. White R.A., III., Panyala, A., Glass, K., Colby, S., Glaesemann, K.R., Jansson, C., Jansson, J.K., 2017a. Mercat: A Versatile k-mer Counter and Diversity Estimator for Database-independent Property Analysis Obtained from Metagenomic and/or Metatranscriptomic Sequencing Data. http://dx.doi.org/10.7287/peerj.preprints.2825v1 〈https://doi.org/10.7287/peerj.preprints.2825v1〉. Wilmes, 2015, A decade of metaproteomics: where we stand and what the future holds, Proteomics, 15, 3409, 10.1002/pmic.201500183 Wood, 2014, Kraken: ultrafast metagenomic sequence classification using exact alignments, Genome Biol., 15, R46, 10.1186/gb-2014-15-3-r46 Worley, 2014, MVAPACK: a complete data handling package for NMR metabolomics, ACS Chem. Biol., 9, 1138, 10.1021/cb4008937 Xia, 2008, MetaboMiner – semi-automated identification of metabolites from 2D NMR spectra of complex biofluids, BMC Bioinform., 9, 507, 10.1186/1471-2105-9-507 Xia, 2009, MetaboAnalyst: a web server for metabolomic data analysis and interpretation, Nucleic Acids Res., 37, W652, 10.1093/nar/gkp356 Zampieri, E., Chiapello, M., Daghino, S., Bonfante, P., Mello, A. Soil metaproteomics reveals an inter-kingdom stress response to the presence of black truffles. Sci. Rep. 2016. Zerbino, 2008, Velvet: algorithms for de novo short read assembly using de bruijn graphs, Genome Res., 18, 821, 10.1101/gr.074492.107 Zhang, 2011, Comprehensive analysis of protein modifications by top-down mass spectrometry, Circ.: Cardiovasc. Genet., 4, 711 Zhang, 2014, These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure, PLoS One, 9, e101271, 10.1371/journal.pone.0101271 Zhang, W., Sun, J., Cao, H., Tian, R., Cai, L., Ding, W., Qian, P.-Y., Post-translational modifications are enriched within protein functional groups important to bacterial adaptation within a deep-sea hydrothermal vent environment. Microbiome 4(1), 2016. http://dx.doi.org/10.1186/s40168-016-0194-x URL 〈https://doi.org/10.1186%2Fs40168-016-0194-x〉. Zhao, 2012, Rapsearch2: a fast and memory-efficient protein similarity search tool for next-generation sequencing data, Bioinformatics, 28, 125, 10.1093/bioinformatics/btr595 Zhou, 2013, Qc-chain: fast and holistic quality control method for next-generation sequencing data, PLoS One, 8, e60234, 10.1371/journal.pone.0060234 Zhou, 2014, Meta-QC-chain: comprehensive and fast quality control method for metagenomic data, Genom. Proteom. Bioinform., 12, 52, 10.1016/j.gpb.2014.01.002 Zimin, 2013, The MaSuRCA genome assembler, Bioinformatics, 29, 2669, 10.1093/bioinformatics/btt476