Decision tree supported substructure prediction of metabolites from GC-MS profiles

Jan Hummel¹, Nadine Strehmel¹, Joachim Selbig², Dirk Walther¹, Joachim Kopka¹
¹ Department Prof. L. Willmitzer, Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam-Golm, Germany
² Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Strasse 24-25, Haus 20, 14476 Potsdam-Golm, Germany
